Mercurial > repos > peterjc > tmhmm_and_signalp
annotate tools/protein_analysis/signalp3.py @ 7:9b45a8743100 draft
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
author | peterjc |
---|---|
date | Mon, 30 Jul 2012 10:25:07 -0400 |
parents | 0f1c61998b22 |
children | e52220a9ddad |
rev | line source |
---|---|
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
1 #!/usr/bin/env python |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
2 """Wrapper for SignalP v3.0 for use in Galaxy. |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
3 |
5
0f1c61998b22
Migrated tool version 0.0.8 from old tool shed archive to new tool shed repository
peterjc
parents:
0
diff
changeset
|
4 This script takes exactly five command line arguments: |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
5 * the organism type (euk, gram+ or gram-) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
6 * length to truncate sequences to (integer) |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
7 * number of threads to use (integer, defaults to one) |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
8 * an input protein FASTA filename |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
9 * output tabular filename. |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
10 |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
11 There are two further optional arguments |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
12 * cut type (NN_Cmax, NN_Ymax, NN_Smax or HMM_Cmax) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
13 * output GFF3 filename |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
14 |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
15 It then calls the standalone SignalP v3.0 program (not the webservice) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
16 requesting the short output (one line per protein) using both NN and HMM |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
17 for predictions. |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
18 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
19 First major feature is cleaning up the output. The raw output from SignalP |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
20 v3.0 looks like this (21 columns space separated): |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
21 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
22 # SignalP-NN euk predictions # SignalP-HMM euk predictions |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
23 # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D ? # name ! Cmax pos ? Sprob ? |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
24 gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
25 gi|4959044|gb|AAD342 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N gi|4959044|gb|AAD34209.1|AF069992_1 Q 0.000 0 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
26 gi|671626|emb|CAA856 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N gi|671626|emb|CAA85685.1| Q 0.000 0 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
27 gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
28 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
29 In order to make it easier to use in Galaxy, this wrapper script reformats |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
30 this to use tab separators. Also it removes the redundant truncated name |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
31 column, and assigns unique column names in the header: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
32 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
33 #ID NN_Cmax_score NN_Cmax_pos NN_Cmax_pred NN_Ymax_score NN_Ymax_pos NN_Ymax_pred NN_Smax_score NN_Smax_pos NN_Smax_pred NN_Smean_score NN_Smean_pred NN_D_score NN_D_pred HMM_bang HMM_Cmax_score HMM_Cmax_pos HMM_Cmax_pred HMM_Sprob_score HMM_Sprob_pred |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
34 gi|2781234|pdb|1JLY|B 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N Q 0.000 17 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
35 gi|4959044|gb|AAD34209.1|AF069992_1 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N Q 0.000 0 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
36 gi|671626|emb|CAA85685.1| 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N Q 0.000 0 N 0.000 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
37 gi|3298468|dbj|BAA31520.1| 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N Q 0.066 24 N 0.139 N |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
38 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
39 The second major feature is overcoming SignalP's built in limit of 4000 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
40 sequences by breaking up the input FASTA file into chunks. This also allows |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
41 us to pre-trim the sequences since SignalP only needs their starts. |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
42 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
43 The third major feature is taking advantage of multiple cores (since SignalP |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
44 v3.0 itself is single threaded) by using the individual FASTA input files to |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
45 run multiple copies of TMHMM in parallel. I would normally use Python's |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
46 multiprocessing library in this situation but it requires at least Python 2.6 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
47 and at the time of writing Galaxy still supports Python 2.4. |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
48 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
49 Note that this is somewhat redundant with job-splitting available in Galaxy |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
50 itself (see the SignalP XML file for settings). |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
51 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
52 Finally, you can opt to have a GFF3 file produced which will describe the |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
53 predicted signal peptide and mature peptide for each protein (using one of |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
54 the predictors which gives a cleavage site). *WORK IN PROGRESS* |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
55 """ |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
56 import sys |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
57 import os |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
58 import tempfile |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
59 from seq_analysis_utils import stop_err, split_fasta, run_jobs, fasta_iterator |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
60 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
61 FASTA_CHUNK = 500 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
62 MAX_LEN = 6000 #Found by trial and error |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
63 |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
64 if len(sys.argv) not in [6,8]: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
65 stop_err("Require five (or 7) arguments, organism, truncate, threads, " |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
66 "input protein FASTA file & output tabular file (plus " |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
67 "optionally cut method and GFF3 output file). " |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
68 "Got %i arguments." % (len(sys.argv)-1)) |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
69 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
70 organism = sys.argv[1] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
71 if organism not in ["euk", "gram+", "gram-"]: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
72 stop_err("Organism argument %s is not one of euk, gram+ or gram-" % organism) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
73 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
74 try: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
75 truncate = int(sys.argv[2]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
76 except: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
77 truncate = 0 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
78 if truncate < 0: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
79 stop_err("Truncate argument %s is not a positive integer (or zero)" % sys.argv[2]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
80 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
81 try: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
82 num_threads = int(sys.argv[3]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
83 except: |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
84 num_threads = 1 #Default, e.g. used "$NSLOTS" and environment variable not defined |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
85 if num_threads < 1: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
86 stop_err("Threads argument %s is not a positive integer" % sys.argv[3]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
87 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
88 fasta_file = sys.argv[4] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
89 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
90 tabular_file = sys.argv[5] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
91 |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
92 if len(sys.argv) == 8: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
93 cut_method = sys.argv[6] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
94 if cut_method not in ["NN_Cmax", "NN_Ymax", "NN_Smax", "HMM_Cmax"]: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
95 stop_err("Invalid cut method %r" % cut_method) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
96 gff3_file = sys.argv[7] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
97 else: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
98 cut_method = None |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
99 gff3_file = None |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
100 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
101 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
102 tmp_dir = tempfile.mkdtemp() |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
103 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
104 def clean_tabular(raw_handle, out_handle, gff_handle=None, cut_method=None): |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
105 """Clean up SignalP output to make it tabular.""" |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
106 if cut_method: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
107 cut_col = {"NN_Cmax" : 2, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
108 "NN_Ymax" : 5, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
109 "NN_Smax" : 8, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
110 "HMM_Cmax" : 16}[cut_method] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
111 else: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
112 cut_col = None |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
113 for line in raw_handle: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
114 if not line or line.startswith("#"): |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
115 continue |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
116 parts = line.rstrip("\r\n").split() |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
117 assert len(parts)==21, repr(line) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
118 assert parts[14].startswith(parts[0]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
119 #Remove redundant truncated name column (col 0) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
120 #and put full name at start (col 14) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
121 parts = parts[14:15] + parts[1:14] + parts[15:] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
122 out_handle.write("\t".join(parts) + "\n") |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
123 |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
124 def make_gff(fasta_file, tabular_file, gff_file, cut_method): |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
125 cut_col, score_col = {"NN_Cmax" : (2,1), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
126 "NN_Ymax" : (5,4), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
127 "NN_Smax" : (8,7), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
128 "HMM_Cmax" : (16,15), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
129 }[cut_method] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
130 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
131 source = "SignalP" |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
132 strand = "." #not stranded |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
133 phase = "." #not phased |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
134 tags = "Note=%s" % cut_method |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
135 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
136 tab_handle = open(tabular_file) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
137 line = tab_handle.readline() |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
138 assert line.startswith("#ID\t"), line |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
139 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
140 gff_handle = open(gff_file, "w") |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
141 gff_handle.write("##gff-version 3\n") |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
142 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
143 for (title, seq), line in zip(fasta_iterator(fasta_file), tab_handle): |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
144 parts = line.rstrip("\n").split("\t") |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
145 seqid = parts[0] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
146 assert title.startswith(seqid), "%s vs %s" % (seqid, title) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
147 if len(seq)==0: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
148 #Is it possible to have a zero length reference in GFF3? |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
149 continue |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
150 cut = int(parts[cut_col]) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
151 if cut == 0: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
152 assert cut_method == "HMM_Cmax", cut_method |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
153 #TODO - Why does it do this? |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
154 cut = 1 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
155 assert 1 <= cut <= len(seq), "%i for %s len %i" % (cut, seqid, len(seq)) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
156 score = parts[score_col] |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
157 gff_handle.write("##sequence-region %s %i %i\n" \ |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
158 % (seqid, 1, len(seq))) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
159 #If the cut is at the very begining, there is no signal peptide! |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
160 if cut > 1: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
161 #signal_peptide = SO:0000418 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
162 gff_handle.write("%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n" \ |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
163 % (seqid, source, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
164 "signal_peptide", 1, cut-1, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
165 score, strand, phase, tags)) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
166 #mature_protein_region = SO:0000419 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
167 gff_handle.write("%s\t%s\t%s\t%i\t%i\t%s\t%s\t%s\t%s\n" \ |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
168 % (seqid, source, |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
169 "mature_protein_region", cut, len(seq), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
170 score, strand, phase, tags)) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
171 tab_handle.close() |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
172 gff_handle.close() |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
173 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
174 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
175 fasta_files = split_fasta(fasta_file, os.path.join(tmp_dir, "signalp"), |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
176 n=FASTA_CHUNK, truncate=truncate, max_len=MAX_LEN) |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
177 temp_files = [f+".out" for f in fasta_files] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
178 assert len(fasta_files) == len(temp_files) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
179 jobs = ["signalp -short -t %s %s > %s" % (organism, fasta, temp) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
180 for (fasta, temp) in zip(fasta_files, temp_files)] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
181 assert len(fasta_files) == len(temp_files) == len(jobs) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
182 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
183 def clean_up(file_list): |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
184 for f in file_list: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
185 if os.path.isfile(f): |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
186 os.remove(f) |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
187 try: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
188 os.rmdir(tmp_dir) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
189 except: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
190 pass |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
191 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
192 if len(jobs) > 1 and num_threads > 1: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
193 #A small "info" message for Galaxy to show the user. |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
194 print "Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs)) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
195 results = run_jobs(jobs, num_threads) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
196 assert len(fasta_files) == len(temp_files) == len(jobs) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
197 for fasta, temp, cmd in zip(fasta_files, temp_files, jobs): |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
198 error_level = results[cmd] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
199 try: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
200 output = open(temp).readline() |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
201 except IOError: |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
202 output = "(no output)" |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
203 if error_level or output.lower().startswith("error running"): |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
204 clean_up(fasta_files + temp_files) |
0
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
205 stop_err("One or more tasks failed, e.g. %i from %r gave:\n%s" % (error_level, cmd, output), |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
206 error_level) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
207 del results |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
208 |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
209 out_handle = open(tabular_file, "w") |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
210 fields = ["ID"] |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
211 #NN results: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
212 for name in ["Cmax", "Ymax", "Smax"]: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
213 fields.extend(["NN_%s_score"%name, "NN_%s_pos"%name, "NN_%s_pred"%name]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
214 fields.extend(["NN_Smean_score", "NN_Smean_pred", "NN_D_score", "NN_D_pred"]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
215 #HMM results: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
216 fields.extend(["HMM_type", "HMM_Cmax_score", "HMM_Cmax_pos", "HMM_Cmax_pred", |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
217 "HMM_Sprob_score", "HMM_Sprob_pred"]) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
218 out_handle.write("#" + "\t".join(fields) + "\n") |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
219 for temp in temp_files: |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
220 data_handle = open(temp) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
221 clean_tabular(data_handle, out_handle) |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
222 data_handle.close() |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
223 out_handle.close() |
bca9bc7fdaef
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff
changeset
|
224 |
7
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
225 #GFF3: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
226 if cut_method: |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
227 make_gff(fasta_file, tabular_file, gff3_file, cut_method) |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
228 |
9b45a8743100
Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents:
5
diff
changeset
|
229 clean_up(fasta_files + temp_files) |