comparison tools/protein_analysis/signalp3.xml @ 4:81caef04ce8b

Migrated tool version 0.0.7 from old tool shed archive to new tool shed repository
author peterjc
date Tue, 07 Jun 2011 18:05:50 -0400
parents f3b373a41f81
children a290c6d4e658
3:f3b373a41f81 4:81caef04ce8b
1 <tool id="signalp3" name="SignalP 3.0" version="0.0.6"> 1 <tool id="signalp3" name="SignalP 3.0" version="0.0.7">
2 <description>Find signal peptides in protein sequences</description> 2 <description>Find signal peptides in protein sequences</description>
3 <command interpreter="python"> 3 <command interpreter="python">
4 signalp3.py $organism $truncate 8 $fasta_file $tabular_file 4 signalp3.py $organism $truncate 8 $fasta_file $tabular_file
5 ##I want the number of threads to be a Galaxy config option... 5 ##I want the number of threads to be a Galaxy config option...
6 </command> 6 </command>
9 <param name="organism" type="select" display="radio" label="Organism"> 9 <param name="organism" type="select" display="radio" label="Organism">
10 <option value="euk">Eukaryote</option> 10 <option value="euk">Eukaryote</option>
11 <option value="gram+">Gram positive</option> 11 <option value="gram+">Gram positive</option>
12 <option value="gram-">Gram negative</option> 12 <option value="gram-">Gram negative</option>
13 </param> 13 </param>
14 <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="60" help="Use zero for no truncation, maximum value 6000"> 14 <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="70" help="Use zero for no truncation, maximum value 6000">
15 <validator type="in_range" min="0" max="6000" message="Truncation value should be at most 6000. Use zero for no truncation."/> 15 <validator type="in_range" min="0" max="6000" message="Truncation value should be at most 6000. Use zero for no truncation."/>
16 </param> 16 </param>
17 </inputs> 17 </inputs>
18 <outputs> 18 <outputs>
19 <data name="tabular_file" format="tabular" label="SignalP $organism results" /> 19 <data name="tabular_file" format="tabular" label="SignalP $organism results" />
44 <param name="fasta_file" value="empty.fasta" ftype="fasta"/> 44 <param name="fasta_file" value="empty.fasta" ftype="fasta"/>
45 <param name="organism" value="gram-"/> 45 <param name="organism" value="gram-"/>
46 <param name="truncate" value="0"/> 46 <param name="truncate" value="0"/>
47 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/> 47 <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
48 </test> 48 </test>
49 <test>
50 <param name="fasta_file" value="rxlr_win_et_al_2007.fasta" ftype="fasta"/>
51 <param name="organism" value="euk"/>
52 <param name="truncate" value="70"/>
53 <output name="tabular_file" file="rxlr_win_et_al_2007_sp3.tabular" ftype="tabular"/>
54 </test>
49 </tests> 55 </tests>
50 <help> 56 <help>
51 57
52 **What it does** 58 **What it does**
53 59
65 71
66 For each organism class (Eukaryote, Gram-negative and Gram-positive), two different neural networks are used, one for predicting the actual signal peptide and one for predicting the position of the signal peptidase I (SPase I) cleavage site. 72 For each organism class (Eukaryote, Gram-negative and Gram-positive), two different neural networks are used, one for predicting the actual signal peptide and one for predicting the position of the signal peptidase I (SPase I) cleavage site.
67 73
68 The NN output comprises three different scores (C-max, S-max and Y-max) and two scores derived from them (S-mean and D-score). 74 The NN output comprises three different scores (C-max, S-max and Y-max) and two scores derived from them (S-mean and D-score).
69 75
70 The C-score is the 'cleavage site' score. For each position in the submitted sequence, a C-score is reported, which should only be significantly high at the cleavage site. Confusion is often seen with the position numbering of the cleavage site. When a cleavage site position is referred to by a single number, the number indicates the first residue in the mature protein, meaning that a reported cleavage site between amino acid 26-27 corresponds to that the mature protein starts at (and include) position 27. 76 The C-score is the 'cleavage site' score. For each position in the submitted sequence, a C-score is reported, which should only be significantly high at the cleavage site. Confusion is often seen with the position numbering of the cleavage site. When a cleavage site position is referred to by a single number, the number indicates the first residue in the mature protein, meaning that a predicted cleavage site between amino acid 26-27 is reported as 27, corresponding to the mature protein starting at (and including) position 27.
71 77
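To illustrate the numbering convention described in the help text above, here is a minimal Python sketch. The function name `split_at_cleavage_site` and the example sequence are hypothetical and are not part of the tool or its wrapper script; the sketch only assumes that the reported number is the 1-based position of the first residue of the mature protein.

```python
def split_at_cleavage_site(sequence: str, reported_position: int) -> tuple[str, str]:
    """Split a protein at a cleavage site reported as a single 1-based number.

    The reported number is the first residue of the mature protein, so a
    cleavage between residues 26 and 27 is reported as 27.
    """
    if not 2 <= reported_position <= len(sequence):
        raise ValueError("reported_position must lie inside the sequence, after residue 1")
    # Convert the 1-based position into a 0-based slice boundary.
    boundary = reported_position - 1
    return sequence[:boundary], sequence[boundary:]


# Hypothetical 36-residue sequence, for illustration only.
example = "M" * 26 + "A" * 10
signal_peptide, mature_protein = split_at_cleavage_site(example, 27)
```

With a reported position of 27, `signal_peptide` covers residues 1 to 26 and `mature_protein` starts at (and includes) residue 27.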
72 The S-score for the signal peptide prediction is calculateded for every single amino acid position in the submitted sequence (not shown in the output via Galaxy), with high scores indicating that the corresponding amino acid is part of a signal peptide, and low scores indicating that the amino acid is part of a mature protein. 78 The S-score for the signal peptide prediction is calculated for every single amino acid position in the submitted sequence (not shown in the output via Galaxy), with high scores indicating that the corresponding amino acid is part of a signal peptide, and low scores indicating that the amino acid is part of a mature protein.
73 79
74 Y-max is a derivative of the C-score combined with the S-score resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The cleavage site is assigned from the Y-score where the slope of the S-score is steep and a significant C-score is found. 80 Y-max is a derivative of the C-score combined with the S-score resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The cleavage site is assigned from the Y-score where the slope of the S-score is steep and a significant C-score is found.
75 81
76 The S-mean is the average of the S-score, ranging from the N-terminal amino acid to the amino acid assigned with the highest Y-max score, thus the S-mean score is calculated for the length of the predicted signal peptide. The S-mean score was in SignalP version 2.0 used as the criteria for discrimination of secretory and non-secretory proteins. 82 The S-mean is the average of the S-score, ranging from the N-terminal amino acid to the amino acid assigned with the highest Y-max score, thus the S-mean score is calculated for the length of the predicted signal peptide. The S-mean score was in SignalP version 2.0 used as the criteria for discrimination of secretory and non-secretory proteins.
77 83
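The S-mean description above can be sketched in Python as follows. This is an illustration under stated assumptions, not SignalP's own code: the function name `s_mean` and the per-position score lists are hypothetical, and the sketch assumes the average runs over the residues before the Y-max position (that is, over the predicted signal peptide), since that position marks the first residue of the mature protein.

```python
from statistics import fmean


def s_mean(s_scores, y_scores):
    """Average the S-score over the predicted signal peptide.

    Assumption: the position with the highest Y-score is the first residue
    of the mature protein, so only the residues before it are averaged.
    """
    if not s_scores or len(s_scores) != len(y_scores):
        raise ValueError("s_scores and y_scores must be non-empty and of equal length")
    # 0-based index of the highest Y-score (the Y-max position).
    y_max_index = max(range(len(y_scores)), key=lambda i: y_scores[i])
    if y_max_index == 0:
        raise ValueError("no residues precede the Y-max position")
    return fmean(s_scores[:y_max_index])


# Made-up scores for a five-residue toy sequence, for illustration only.
toy_s = [0.9, 0.8, 0.7, 0.1, 0.1]
toy_y = [0.1, 0.2, 0.3, 0.9, 0.1]
toy_s_mean = s_mean(toy_s, toy_y)
```

In the toy data the Y-score peaks at the fourth residue, so the mean is taken over the first three S-scores.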