annotate tools/protein_analysis/signalp3.py @ 23:e1996f0f4e85 draft default tip

"v0.2.13 - Python 3 fix for raising StopIteration"
author peterjc
date Thu, 17 Jun 2021 17:59:33 +0000
parents 238eae32483c
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
1 #!/usr/bin/env python
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
2 """Wrapper for SignalP v3.0 for use in Galaxy.
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
3
5
0f1c61998b22 Migrated tool version 0.0.8 from old tool shed archive to new tool shed repository
peterjc
parents: 0
diff changeset
4 This script takes exactly five command line arguments:
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
5 * the organism type (euk, gram+ or gram-)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
6 * length to truncate sequences to (integer)
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
7 * number of threads to use (integer, defaults to one)
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
8 * an input protein FASTA filename
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
9 * output tabular filename.
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
10
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
11 There are two further optional arguments
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
12 * cut type (NN_Cmax, NN_Ymax, NN_Smax or HMM_Cmax)
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
13 * output GFF3 filename
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
14
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
15 It then calls the standalone SignalP v3.0 program (not the webservice)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
16 requesting the short output (one line per protein) using both NN and HMM
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
17 for predictions.
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
18
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
19 First major feature is cleaning up the output. The raw output from SignalP
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
20 v3.0 looks like this (21 columns space separated):
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
21
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
22 # SignalP-NN euk predictions # SignalP-HMM euk predictions
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
23 # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D ? # name ! Cmax pos ? Sprob ?
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
24 gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
25 gi|4959044|gb|AAD342 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N gi|4959044|gb|AAD34209.1|AF069992_1 Q 0.000 0 N 0.000 N
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
26 gi|671626|emb|CAA856 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N gi|671626|emb|CAA85685.1| Q 0.000 0 N 0.000 N
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
27 gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
28
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
29 In order to make it easier to use in Galaxy, this wrapper script reformats
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
30 this to use tab separators. Also it removes the redundant truncated name
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
31 column, and assigns unique column names in the header:
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
32
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
33 #ID NN_Cmax_score NN_Cmax_pos NN_Cmax_pred NN_Ymax_score NN_Ymax_pos NN_Ymax_pred NN_Smax_score NN_Smax_pos NN_Smax_pred NN_Smean_score NN_Smean_pred NN_D_score NN_D_pred HMM_bang HMM_Cmax_score HMM_Cmax_pos HMM_Cmax_pred HMM_Sprob_score HMM_Sprob_pred
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
34 gi|2781234|pdb|1JLY|B 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N Q 0.000 17 N 0.000 N
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
35 gi|4959044|gb|AAD34209.1|AF069992_1 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N Q 0.000 0 N 0.000 N
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
36 gi|671626|emb|CAA85685.1| 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N Q 0.000 0 N 0.000 N
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
37 gi|3298468|dbj|BAA31520.1| 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N Q 0.066 24 N 0.139 N
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
38
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
39 The second major feature is overcoming SignalP's built in limit of 4000
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
40 sequences by breaking up the input FASTA file into chunks. This also allows
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
41 us to pre-trim the sequences since SignalP only needs their starts.
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
42
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
43 The third major feature is taking advantage of multiple cores (since SignalP
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
44 v3.0 itself is single threaded) by using the individual FASTA input files to
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
45 run multiple copies of TMHMM in parallel. I would normally use Python's
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
46 multiprocessing library in this situation but it requires at least Python 2.6
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
47 and at the time of writing Galaxy still supports Python 2.4.
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
48
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
49 Note that this is somewhat redundant with job-splitting available in Galaxy
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
50 itself (see the SignalP XML file for settings).
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
51
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
52 Finally, you can opt to have a GFF3 file produced which will describe the
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
53 predicted signal peptide and mature peptide for each protein (using one of
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
54 the predictors which gives a cleavage site). *WORK IN PROGRESS*
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
55 """ # noqa: E501
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
56
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
57 from __future__ import print_function
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
58
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
59 import os
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
60 import sys
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
61 import tempfile
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
62
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
63 from seq_analysis_utils import fasta_iterator, split_fasta
9
e52220a9ddad Uploaded v0.1.2
peterjc
parents: 7
diff changeset
64 from seq_analysis_utils import run_jobs, thread_count
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
65
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
66 FASTA_CHUNK = 500
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
67 MAX_LEN = 6000 # Found by trial and error
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
68
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
69 if "-v" in sys.argv or "--version" in sys.argv:
23
e1996f0f4e85 "v0.2.13 - Python 3 fix for raising StopIteration"
peterjc
parents: 21
diff changeset
70 print("SignalP Galaxy wrapper version 0.0.20")
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
71 sys.exit(os.system("signalp -version"))
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
72
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
73 if len(sys.argv) not in [6, 8]:
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
74 sys.exit(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
75 "Require five (or 7) arguments, organism, truncate, threads, "
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
76 "input protein FASTA file & output tabular file (plus "
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
77 "optionally cut method and GFF3 output file). "
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
78 "Got %i arguments." % (len(sys.argv) - 1)
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
79 )
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
80
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
81 organism = sys.argv[1]
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
82 if organism not in ["euk", "gram+", "gram-"]:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
83 sys.exit("Organism argument %s is not one of euk, gram+ or gram-" % organism)
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
84
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
85 try:
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
86 truncate = int(sys.argv[2])
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
87 except ValueError:
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
88 truncate = 0
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
89 if truncate < 0:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
90 sys.exit("Truncate argument %s is not a positive integer (or zero)" % sys.argv[2])
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
91
9
e52220a9ddad Uploaded v0.1.2
peterjc
parents: 7
diff changeset
92 num_threads = thread_count(sys.argv[3], default=4)
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
93 fasta_file = sys.argv[4]
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
94 tabular_file = sys.argv[5]
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
95
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
96 if len(sys.argv) == 8:
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
97 cut_method = sys.argv[6]
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
98 if cut_method not in ["NN_Cmax", "NN_Ymax", "NN_Smax", "HMM_Cmax"]:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
99 sys.exit("Invalid cut method %r" % cut_method)
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
100 gff3_file = sys.argv[7]
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
101 else:
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
102 cut_method = None
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
103 gff3_file = None
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
104
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
105
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
106 tmp_dir = tempfile.mkdtemp()
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
107
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
108
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
109 def clean_tabular(raw_handle, out_handle, gff_handle=None):
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
110 """Clean up SignalP output to make it tabular."""
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
111 for line in raw_handle:
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
112 if not line or line.startswith("#"):
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
113 continue
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
114 parts = line.rstrip("\r\n").split()
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
115 assert len(parts) == 21, repr(line)
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
116 assert parts[14].startswith(parts[0]), (
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
117 "Bad entry in SignalP output, ID miss-match:\n%r" % line
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
118 )
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
119 # Remove redundant truncated name column (col 0)
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
120 # and put full name at start (col 14)
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
121 parts = parts[14:15] + parts[1:14] + parts[15:]
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
122 out_handle.write("\t".join(parts) + "\n")
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
123
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
124
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
125 def make_gff(fasta_file, tabular_file, gff_file, cut_method):
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
126 """Make a GFF file."""
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
127 cut_col, score_col = {
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
128 "NN_Cmax": (2, 1),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
129 "NN_Ymax": (5, 4),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
130 "NN_Smax": (8, 7),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
131 "HMM_Cmax": (16, 15),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
132 }[cut_method]
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
133
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
134 source = "SignalP"
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
135 strand = "." # not stranded
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
136 phase = "." # not phased
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
137 tags = "Note=%s" % cut_method
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
138
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
139 tab_handle = open(tabular_file)
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
140 line = tab_handle.readline()
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
141 assert line.startswith("#ID\t"), line
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
142
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
143 gff_handle = open(gff_file, "w")
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
144 gff_handle.write("##gff-version 3\n")
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
145
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
146 for (title, seq), line in zip(fasta_iterator(fasta_file), tab_handle):
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
147 parts = line.rstrip("\n").split("\t")
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
148 seqid = parts[0]
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
149 assert title.startswith(seqid), "%s vs %s" % (seqid, title)
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
150 if not seq:
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
151 # Is it possible to have a zero length reference in GFF3?
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
152 continue
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
153 cut = int(parts[cut_col])
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
154 if cut == 0:
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
155 assert cut_method == "HMM_Cmax", cut_method
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
156 # TODO - Why does it do this?
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
157 cut = 1
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
158 assert 1 <= cut <= len(seq), "%i for %s len %i" % (cut, seqid, len(seq))
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
159 score = parts[score_col]
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
160 gff_handle.write("##sequence-region %s %i %i\n" % (seqid, 1, len(seq)))
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
161 # If the cut is at the very begining, there is no signal peptide!
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
162 if cut > 1:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
163 # signal_peptide = SO:0000418
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
164 gff_handle.write(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
165 "%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n"
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
166 % (
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
167 seqid,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
168 source,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
169 "signal_peptide",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
170 1,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
171 cut - 1,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
172 score,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
173 strand,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
174 phase,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
175 tags,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
176 )
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
177 )
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
178 # mature_protein_region = SO:0000419
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
179 gff_handle.write(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
180 "%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n"
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
181 % (
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
182 seqid,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
183 source,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
184 "mature_protein_region",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
185 cut,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
186 len(seq),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
187 score,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
188 strand,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
189 phase,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
190 tags,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
191 )
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
192 )
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
193 tab_handle.close()
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
194 gff_handle.close()
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
195
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
196
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
197 if num_threads == 1:
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
198 # Still want to call split_fasta to apply truncation, but
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
199 # no reason to make multiple files - and more chance of
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
200 # hitting file system glitches if we do. So,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
201 FASTA_CHUNK = sys.maxsize
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
202
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
203 fasta_files = split_fasta(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
204 fasta_file,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
205 os.path.join(tmp_dir, "signalp"),
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
206 n=FASTA_CHUNK,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
207 truncate=truncate,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
208 max_len=MAX_LEN,
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
209 )
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
210 temp_files = [f + ".out" for f in fasta_files]
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
211 assert len(fasta_files) == len(temp_files)
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
212 jobs = [
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
213 "signalp -short -t %s %s > %s" % (organism, fasta, temp)
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
214 for (fasta, temp) in zip(fasta_files, temp_files)
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
215 ]
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
216 assert len(fasta_files) == len(temp_files) == len(jobs)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
217
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
218
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
219 def clean_up(file_list):
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
220 """Remove temp files, and if possible the temp directory."""
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
221 for f in file_list:
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
222 if os.path.isfile(f):
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
223 os.remove(f)
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
224 try:
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
225 os.rmdir(tmp_dir)
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
226 except Exception:
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
227 pass
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
228
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
229
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
230 if len(jobs) > 1 and num_threads > 1:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
231 # A small "info" message for Galaxy to show the user.
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
232 print("Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs)))
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
233 results = run_jobs(jobs, num_threads)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
234 assert len(fasta_files) == len(temp_files) == len(jobs)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
235 for fasta, temp, cmd in zip(fasta_files, temp_files, jobs):
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
236 error_level = results[cmd]
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
237 try:
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
238 output = open(temp).readline()
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
239 except IOError:
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
240 output = "(no output)"
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
241 if error_level or output.lower().startswith("error running"):
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
242 clean_up(fasta_files + temp_files)
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
243 if output:
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
244 sys.stderr.write(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
245 "One or more tasks failed, e.g. %i from %r gave:\n%s"
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
246 % (error_level, cmd, output)
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
247 )
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
248 else:
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
249 sys.stderr.write(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
250 "One or more tasks failed, e.g. %i from %r with no output\n"
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
251 % (error_level, cmd)
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
252 )
20
a19b3ded8f33 v0.2.11 Job splitting fast-fail; RXLR tools supports HMMER2 from BioConda; Capture more version information; misc internal changes
peterjc
parents: 19
diff changeset
253 sys.exit(error_level)
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
254 del results
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
255
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
256 out_handle = open(tabular_file, "w")
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
257 fields = ["ID"]
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
258 # NN results:
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
259 for name in ["Cmax", "Ymax", "Smax"]:
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
260 fields.extend(["NN_%s_score" % name, "NN_%s_pos" % name, "NN_%s_pred" % name])
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
261 fields.extend(["NN_Smean_score", "NN_Smean_pred", "NN_D_score", "NN_D_pred"])
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
262 # HMM results:
21
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
263 fields.extend(
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
264 [
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
265 "HMM_type",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
266 "HMM_Cmax_score",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
267 "HMM_Cmax_pos",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
268 "HMM_Cmax_pred",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
269 "HMM_Sprob_score",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
270 "HMM_Sprob_pred",
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
271 ]
238eae32483c "Check this is up to date with all 2020 changes (black etc)"
peterjc
parents: 20
diff changeset
272 )
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
273 out_handle.write("#" + "\t".join(fields) + "\n")
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
274 for temp in temp_files:
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
275 data_handle = open(temp)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
276 clean_tabular(data_handle, out_handle)
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
277 data_handle.close()
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
278 out_handle.close()
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
279
19
f3ecd80850e2 v0.2.9 Python style improvements
peterjc
parents: 18
diff changeset
280 # GFF3:
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
281 if cut_method:
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
282 make_gff(fasta_file, tabular_file, gff3_file, cut_method)
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
283
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 5
diff changeset
284 clean_up(fasta_files + temp_files)