annotate tools/protein_analysis/signalp3.xml @ 18:eb6ac44d4b8e draft

Suite v0.2.8, record Promoter 2 version + misc internal updates
author peterjc
date Tue, 01 Sep 2015 09:56:36 -0400
parents e6cc27d182a8
children a19b3ded8f33
rev   line source
18
eb6ac44d4b8e Suite v0.2.8, record Promoter 2 version + misc internal updates
peterjc
parents: 17
diff changeset
1 <tool id="signalp3" name="SignalP 3.0" version="0.0.15">
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
2 <description>Find signal peptides in protein sequences</description>
7
9b45a8743100 Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM X Promoter wrappers.
peterjc
parents: 6
diff changeset
3 <!-- If job splitting is enabled, break up the query file into parts -->
4 <!-- Using chunks of 2000 sequences, so that 4 threads doing 500 each is ideal -->
5 <parallelism method="basic" split_inputs="fasta_file" split_mode="to_size" split_size="2000" merge_outputs="tabular_file"></parallelism>
18
eb6ac44d4b8e Suite v0.2.8, record Promoter 2 version + misc internal updates
peterjc
parents: 17
diff changeset
6 <requirements>
7 <requirement type="binary">signalp</requirement>
8 <requirement type="package">signalp</requirement>
9 </requirements>
10 <stdio>
11 <!-- Anything other than zero is an error -->
12 <exit_code range="1:" />
13 <exit_code range=":-1" />
14 </stdio>
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
15 <command interpreter="python">
17
e6cc27d182a8 Uploaded v0.2.6, embedded citations and uses $GALAXY_SLOTS
peterjc
parents: 16
diff changeset
16 signalp3.py $organism $truncate "\$GALAXY_SLOTS" $fasta_file $tabular_file
16
7de64c8b258d Uploaded v0.2.5, MIT licence, RST for README, citation information, development moved to GitHub
peterjc
parents: 11
diff changeset
17 ##If the environment variable isn't set, get "", and the python wrapper
18 ##defaults to four threads.
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
19 </command>
20 <inputs>
21 <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/>
22 <param name="organism" type="select" display="radio" label="Organism">
23 <option value="euk">Eukaryote</option>
24 <option value="gram+">Gram positive</option>
25 <option value="gram-">Gram negative</option>
26 </param>
4
81caef04ce8b Migrated tool version 0.0.7 from old tool shed archive to new tool shed repository
peterjc
parents: 3
diff changeset
27 <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="70" help="Use zero for no truncation, maximum value 6000">
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
28 <validator type="in_range" min="0" max="6000" message="Truncation value should be at most 6000. Use zero for no truncation."/>
29 </param>
30 </inputs>
31 <outputs>
32 <data name="tabular_file" format="tabular" label="SignalP $organism results" />
33 </outputs>
34 <tests>
35 <test>
36 <param name="fasta_file" value="four_human_proteins.fasta" ftype="fasta"/>
37 <param name="organism" value="euk"/>
38 <param name="truncate" value="0"/>
1
3ff1dcbb9440 Migrated tool version 0.0.3 from old tool shed archive to new tool shed repository
peterjc
parents: 0
diff changeset
39 <output name="tabular_file" file="four_human_proteins.signalp3.tabular" ftype="tabular"/>
40 </test>
41 <test>
42 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
43 <param name="organism" value="euk"/>
44 <param name="truncate" value="60"/>
45 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
46 </test>
47 <test>
48 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
49 <param name="organism" value="gram+"/>
50 <param name="truncate" value="80"/>
51 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
52 </test>
53 <test>
54 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
55 <param name="organism" value="gram-"/>
56 <param name="truncate" value="0"/>
57 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
58 </test>
4
81caef04ce8b Migrated tool version 0.0.7 from old tool shed archive to new tool shed repository
peterjc
parents: 3
diff changeset
59 <test>
60 <param name="fasta_file" value="rxlr_win_et_al_2007.fasta" ftype="fasta"/>
61 <param name="organism" value="euk"/>
62 <param name="truncate" value="70"/>
63 <output name="tabular_file" file="rxlr_win_et_al_2007_sp3.tabular" ftype="tabular"/>
64 </test>
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
65 </tests>
66 <help>
67
68 **What it does**
69
1
3ff1dcbb9440 Migrated tool version 0.0.3 from old tool shed archive to new tool shed repository
peterjc
parents: 0
diff changeset
70 This calls the SignalP v3.0 tool for prediction of signal peptides, which uses both a Neural Network (NN) and Hidden Markov Model (HMM) to produce two sets of scores.
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
71
72 The input is a FASTA file of protein sequences, and the output is tabular with twenty columns (one row per protein):
73
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
74 ====== =================================================
75 Column Description
76 ------ -------------------------------------------------
77 1 Sequence identifier
78 2-14 Neural Network (NN) predictions (13 columns)
79 15-20 Hidden Markov Model (HMM) predictions (6 columns)
80 ====== =================================================
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
81
82 Internally the input FASTA file is divided into parts (to allow multiple processors to be used), and the proteins are truncated as specified (see Truncation below). The raw output from SignalP is then reformatted into a tabular layout suitable for Galaxy (see Notes below).
83
84 **Neural Network Scores**
85
86 For each organism class (Eukaryote, Gram-negative and Gram-positive), two different neural networks are used, one for predicting the actual signal peptide and one for predicting the position of the signal peptidase I (SPase I) cleavage site.
87
88 The NN output comprises three different scores (C-max, S-max and Y-max) and two scores derived from them (S-mean and D-score).
89
11
99b82a2b1272 Uploaded v0.2.0 which added PSORTb wrapper (written with Konrad Paszkiewicz)
peterjc
parents: 9
diff changeset
90 ====== ======= ===============================================================
91 Column Name Description
92 ------ ------- ---------------------------------------------------------------
93 2-4 C-score The C-score is the 'cleavage site' score. For each position in
94 the submitted sequence, a C-score is reported, which should
95 only be significantly high at the cleavage site. Confusion is
96 often seen with the position numbering of the cleavage site.
97 When a cleavage site position is referred to by a single number,
98 the number indicates the first residue in the mature protein,
99 meaning that a predicted cleavage site between amino acids 26 and 27
100 is reported as 27, corresponding to the mature protein starting
101 at (and including) position 27.
102 ------ ------- ---------------------------------------------------------------
103 5-7 S-score The S-score for the signal peptide prediction is calculated for
104 every single amino acid position in the submitted sequence (not
105 shown in the output via Galaxy), with high scores indicating
106 that the corresponding amino acid is part of a signal peptide,
107 and low scores indicating that the amino acid is part of a
108 mature protein.
109 ------ ------- ---------------------------------------------------------------
110 8-10 Y-max Y-max is a derivative of the C-score combined with the S-score
111 resulting in a better cleavage site prediction than the raw
112 C-score alone. This is due to the fact that multiple high-peaking
113 C-scores can be found in one sequence, where only one is the
114 true cleavage site. The cleavage site is assigned from the
115 Y-score where the slope of the S-score is steep and a
116 significant C-score is found.
117 ------ ------- ---------------------------------------------------------------
118 11-12 S-mean The S-mean is the average of the S-score, ranging from the
119 N-terminal amino acid to the amino acid assigned with the
120 highest Y-max score, thus the S-mean score is calculated for
121 the length of the predicted signal peptide. The S-mean score
122 was used in SignalP version 2.0 as the criterion for
123 discrimination of secretory and non-secretory proteins.
124 ------ ------- ---------------------------------------------------------------
125 13-14 D-score The D-score was introduced in SignalP version 3.0 and is a
126 simple average of the S-mean and Y-max score. The score shows
127 superior discrimination performance of secretory and
128 non-secretory proteins to that of the S-mean score which was
129 used in SignalP versions 1 and 2.
130 ====== ======= ===============================================================
0
bca9bc7fdaef Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
peterjc
parents:
diff changeset
131
132 For non-secretory proteins all the scores represented in the SignalP3-NN output should ideally be very low.
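As a small illustration of how the D-score relates to the other NN scores, the sketch below averages S-mean and Y-max as described in the table above. The function name `d_score` is hypothetical and is not part of SignalP or of this wrapper; SignalP reports the D-score itself, so this is only a way to sanity-check the relationship.

```python
def d_score(s_mean, y_max):
    """Return the simple average of the S-mean and Y-max scores.

    Hypothetical helper for illustration only; SignalP 3.0 computes
    and reports the D-score itself.
    """
    return (s_mean + y_max) / 2.0
```

Because the tabular output holds values rounded to three decimal places, a D-score recomputed this way may differ from the reported one in the last digit.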
133
134 **Hidden Markov Model Scores**
135
136 The hidden Markov model calculates the probability that the submitted sequence contains a signal peptide. The eukaryotic HMM also reports the probability of a signal anchor, previously named uncleaved signal peptides. Furthermore, the cleavage site is assigned by a probability score together with scores for the n-region, h-region, and c-region of the signal peptide, if one is found.
137
138 The 'type' column uses 'S' for a signal peptide (i.e. a secretory protein) and 'Q' for a non-secretory protein.
139
140 **Notes**
141
142 The raw 'short' output from SignalP v3.0 looks something like this (21 columns, space separated, shown here formatted nicely). Notice that the identifiers are given twice, the first time truncated (as part of the NN predictions) and the second time in full (in the HMM predictions).
143
144 ==================== ===== === = ===== === = ===== === = ===== = ===== = =================================== = ===== === = ===== =
145 # SignalP-NN euk predictions # SignalP-HMM euk predictions
146 ----------------------------------------------------------------------------- ------------------------------------------------------------
147 # name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D ? # name ! Cmax pos ? Sprob ?
148 gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N
149 gi|4959044|gb|AAD342 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N gi|4959044|gb|AAD34209.1|AF069992_1 Q 0.000 0 N 0.000 N
150 gi|671626|emb|CAA856 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N gi|671626|emb|CAA85685.1| Q 0.000 0 N 0.000 N
151 gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N
152 ==================== ===== === = ===== === = ===== === = ===== = ===== = =================================== = ===== === = ===== =
153
154 To make this easier to use in Galaxy, the wrapper script drops the redundant truncated identifier column and uses tabs for separation. It also adds a header line with unique column names.
155
156 =================================== ============= =========== ============ ============= =========== ============ ============= =========== ============ ============== ============= ========== ========= ======== ============== ============ ============= =============== ==============
157 #ID NN_Cmax_score NN_Cmax_pos NN_Cmax_pred NN_Ymax_score NN_Ymax_pos NN_Ymax_pred NN_Smax_score NN_Smax_pos NN_Smax_pred NN_Smean_score NN_Smean_pred NN_D_score NN_D_pred HMM_type HMM_Cmax_score HMM_Cmax_pos HMM_Cmax_pred HMM_Sprob_score HMM_Sprob_pred
158 gi|2781234|pdb|1JLY|B 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N Q 0.000 17 N 0.000 N
159 gi|4959044|gb|AAD34209.1|AF069992_1 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N Q 0.000 0 N 0.000 N
160 gi|671626|emb|CAA85685.1| 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N Q 0.000 0 N 0.000 N
161 gi|3298468|dbj|BAA31520.1| 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N Q 0.066 24 N 0.139 N
162 =================================== ============= =========== ============ ============= =========== ============ ============= =========== ============ ============== ============= ========== ========= ======== ============== ============ ============= =============== ==============
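The reformatting described above could be sketched roughly as follows. This is not the code in `signalp3.py`; the function name `reformat_short_line` is hypothetical. It assumes each data line has exactly 21 whitespace-separated fields, with the truncated identifier in field 0 and the full identifier in field 14, as in the raw example shown earlier. Comment lines starting with `#` are assumed to be skipped by the caller.

```python
def reformat_short_line(line):
    """Convert one 21-column SignalP 'short' line to 20 tab-separated fields.

    Hypothetical sketch: assumes whitespace-separated fields with the
    truncated identifier in field 0 and the full identifier in field 14.
    """
    fields = line.split()
    if len(fields) != 21:
        raise ValueError("Expected 21 fields, got %i" % len(fields))
    # Full identifier first, then the 13 NN fields, then the 6 HMM fields.
    return "\t".join([fields[14]] + fields[1:14] + fields[15:])
```

Taking the full identifier from the HMM half rather than the truncated one from the NN half keeps long sequence names intact in the Galaxy output.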
163
164 **Truncation**
165
166 Signal peptides are found at the start of a protein, so there is limited value in providing the full-length sequence, which also slows down the analysis. Furthermore, SignalP has an upper bound on the sequence length it will accept (6000 amino acids). For practical reasons it is therefore useful to truncate the proteins before passing them to SignalP. However, the precise truncation point has a small influence on some score values, and thus on the results.
167
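The truncation described above can be sketched in a few lines of Python. This is an illustration only, not the code this wrapper runs: ``read_fasta`` and ``truncate_records`` are hypothetical helper names, and the default ``max_length`` of 70 is an arbitrary value chosen for the example rather than a recommended setting.

```python
def read_fasta(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA lines."""
    title, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif line and title is not None:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


def truncate_records(records, max_length=70):
    """Keep only the first max_length residues (the N-terminus) of each sequence.

    The default of 70 is an assumption made for this sketch.
    """
    for title, seq in records:
        yield title, seq[:max_length]


# Made-up sequences, purely for demonstration.
fasta = [
    ">demo_protein hypothetical example",
    "MKKLLLAVSLAG",
    "LLGSAQA",
    ">short",
    "MK",
]
for title, seq in truncate_records(read_fasta(fasta), max_length=10):
    print(">" + title)
    print(seq)
```

Sequences shorter than the cut-off pass through unchanged, so the same step can be applied to every record in a file.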
168 **References**
169
16
7de64c8b258d Uploaded v0.2.5, MIT licence, RST for README, citation information, development moved to GitHub
peterjc
parents: 11
diff changeset
170 If you use this Galaxy tool in work leading to a scientific publication please
171 cite the following papers:
172
173 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
174 Galaxy tools and workflows for sequence analysis with applications
175 in molecular plant pathology. PeerJ 1:e167
176 http://dx.doi.org/10.7717/peerj.167
177
178 Bendtsen, Nielsen, von Heijne, and Brunak (2004).
0
179 Improved prediction of signal peptides: SignalP 3.0.
16
180 J. Mol. Biol., 340:783-795.
6
a290c6d4e658 Migrated tool version 0.0.9 from old tool shed archive to new tool shed repository
peterjc
parents: 4
diff changeset
181 http://dx.doi.org/10.1016/j.jmb.2004.05.028
0
182
16
183 Nielsen, Engelbrecht, Brunak and von Heijne (1997).
0
184 Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites.
16
185 Protein Engineering, 10:1-6.
6
186 http://dx.doi.org/10.1093/protein/10.1.1
0
187
16
188 Nielsen and Krogh (1998).
0
189 Prediction of signal peptides and signal anchors by a hidden Markov model.
190 Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology (ISMB 6),
16
191 AAAI Press, Menlo Park, California, pp. 122-130.
6
192 http://www.ncbi.nlm.nih.gov/pubmed/9783217
0
193
16
194 See also http://www.cbs.dtu.dk/services/SignalP-3.0/output.php
0
195
16
196 This wrapper is available to install into other Galaxy instances via the Galaxy
197 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/tmhmm_and_signalp
0
198 </help>
17
e6cc27d182a8 Uploaded v0.2.6, embedded citations and uses $GALAXY_SLOTS
peterjc
parents: 16
diff changeset
199 <citations>
200 <citation type="doi">10.7717/peerj.167</citation>
201 <citation type="doi">10.1016/j.jmb.2004.05.028</citation>
202 <citation type="doi">10.1093/protein/10.1.1</citation>
203 <!-- TODO - Add bibtex entry for PMID: 9783217 -->
204 </citations>
0
205 </tool>