changeset 4:81caef04ce8b

Migrated tool version 0.0.7 from old tool shed archive to new tool shed repository
author peterjc
date Tue, 07 Jun 2011 18:05:50 -0400
parents f3b373a41f81
children 0f1c61998b22
files tools/protein_analysis/README tools/protein_analysis/seq_analysis_utils.py tools/protein_analysis/signalp3.xml tools/protein_analysis/suite_config.xml
diffstat 3 files changed, 15 insertions(+), 6 deletions(-)
--- a/tools/protein_analysis/README	Tue Jun 07 18:05:13 2011 -0400
+++ b/tools/protein_analysis/README	Tue Jun 07 18:05:50 2011 -0400
@@ -73,6 +73,9 @@
 v0.0.4 - Ignore comment lines in tmhmm2 output.
 v0.0.5 - Explicitly request tmhmm short output (may not be the default)
 v0.0.6 - Improvement to how sub-jobs are run (should be faster)
+v0.0.7 - Change SignalP default truncation from 60 to 70 to match the
+         SignalP webservice.
+
 
 Developers
 ==========
--- a/tools/protein_analysis/signalp3.xml	Tue Jun 07 18:05:13 2011 -0400
+++ b/tools/protein_analysis/signalp3.xml	Tue Jun 07 18:05:50 2011 -0400
@@ -1,4 +1,4 @@
-<tool id="signalp3" name="SignalP 3.0" version="0.0.6">
+<tool id="signalp3" name="SignalP 3.0" version="0.0.7">
     <description>Find signal peptides in protein sequences</description>
     <command interpreter="python">
       signalp3.py $organism $truncate 8 $fasta_file $tabular_file
@@ -11,7 +11,7 @@
             <option value="gram+">Gram positive</option>
             <option value="gram-">Gram negative</option>
         </param>
-        <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="60" help="Use zero for no truncation, maximum value 6000">
+        <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="70" help="Use zero for no truncation, maximum value 6000">
             <validator type="in_range" min="0" max="6000" message="Truncation value should be at most 6000. Use zero for no truncation."/>
         </param>
     </inputs>
@@ -46,6 +46,12 @@
             <param name="truncate" value="0"/> 
             <output name="tabular_file" file="empty_signalp3.tabular" ftype="tabular"/>
         </test>
+        <test>
+            <param name="fasta_file" value="rxlr_win_et_al_2007.fasta" ftype="fasta"/>
+            <param name="organism" value="euk"/>
+            <param name="truncate" value="70"/> 
+            <output name="tabular_file" file="rxlr_win_et_al_2007_sp3.tabular" ftype="tabular"/>
+        </test>
     </tests>
     <help>
     
@@ -67,9 +73,9 @@
 
 The NN output comprises three different scores (C-max, S-max and Y-max) and two scores derived from them (S-mean and D-score).
 
-The C-score is the 'cleavage site' score. For each position in the submitted sequence, a C-score is reported, which should only be significantly high at the cleavage site. Confusion is often seen with the position numbering of the cleavage site. When a cleavage site position is referred to by a single number, the number indicates the first residue in the mature protein, meaning that a reported cleavage site between amino acid 26-27 corresponds to that the mature protein starts at (and include) position 27.
+The C-score is the 'cleavage site' score. For each position in the submitted sequence, a C-score is reported, which should only be significantly high at the cleavage site. Confusion is often seen with the position numbering of the cleavage site. When a cleavage site position is referred to by a single number, the number indicates the first residue in the mature protein, meaning that a predicted cleavage site between amino acid 26-27 is reported as 27, corresponding to the mature protein starting at (and including) position 27.
 
-The S-score for the signal peptide prediction is calculateded for every single amino acid position in the submitted sequence (not shown in the output via Galaxy), with high scores indicating that the corresponding amino acid is part of a signal peptide, and low scores indicating that the amino acid is part of a mature protein.
+The S-score for the signal peptide prediction is calculated for every single amino acid position in the submitted sequence (not shown in the output via Galaxy), with high scores indicating that the corresponding amino acid is part of a signal peptide, and low scores indicating that the amino acid is part of a mature protein.
 
 Y-max is a derivative of the C-score combined with the S-score resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The cleavage site is assigned from the Y-score where the slope of the S-score is steep and a significant C-score is found.
 
--- a/tools/protein_analysis/suite_config.xml	Tue Jun 07 18:05:13 2011 -0400
+++ b/tools/protein_analysis/suite_config.xml	Tue Jun 07 18:05:50 2011 -0400
@@ -1,9 +1,9 @@
-    <suite id="tmhmm_and_signalp" name="TMHMM and SignalP" version="0.0.6">
+    <suite id="tmhmm_and_signalp" name="TMHMM and SignalP" version="0.0.7">
         <description>Wrappers for TMHMM and SignalP</description>
         <tool id="tmhmm2" name="TMHMM 2.0" version="0.0.6">
             <description>Find transmembrane domains in protein sequences</description>
         </tool>
-        <tool id="signalp3" name="SignalP 3.0" version="0.0.6">
+        <tool id="signalp3" name="SignalP 3.0" version="0.0.7">
             <description>Find signal peptides in protein sequences</description>
         </tool>
     </suite>
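
The two behaviours this changeset documents (the truncation default of 70 with zero meaning no truncation, and the cleavage site being reported as the first residue of the mature protein) can be illustrated with a minimal Python sketch. This is not the code in signalp3.py, which is not shown in this diff; the function names `truncate_sequence` and `split_at_cleavage_site` are hypothetical and exist only for illustration. The 0 to 6000 range mirrors the `in_range` validator in signalp3.xml.

```python
def truncate_sequence(seq, truncate=70):
    """Return the first `truncate` residues of seq; zero means no truncation.

    Hypothetical helper: the default of 70 and the 0..6000 range follow the
    'truncate' parameter in signalp3.xml, not the actual wrapper code.
    """
    if not 0 <= truncate <= 6000:
        raise ValueError(
            "Truncation value should be at most 6000. "
            "Use zero for no truncation."
        )
    if truncate == 0:
        return seq
    return seq[:truncate]


def split_at_cleavage_site(seq, reported_position):
    """Split seq into (signal peptide, mature protein).

    `reported_position` is 1-based and names the first residue of the
    mature protein, so a cleavage between residues 26 and 27 is given as 27.
    """
    if not 1 <= reported_position <= len(seq):
        raise ValueError("Reported position must lie within the sequence.")
    index = reported_position - 1  # convert to a 0-based Python index
    return seq[:index], seq[index:]


if __name__ == "__main__":
    # Made-up sequence: 26 residues of 'A' followed by 74 residues of 'B'.
    protein = "A" * 26 + "B" * 74
    print(len(truncate_sequence(protein)))
    signal, mature = split_at_cleavage_site(protein, 27)
    print(len(signal), len(mature))
```

With a reported position of 27, the signal peptide is residues 1 to 26 and the mature protein starts at (and includes) residue 27, which matches the numbering described in the revised help text.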