annotate protein_prophet.xml @ 4:a67a5d30bb80

Uploaded
author iracooke
date Mon, 04 Mar 2013 19:04:16 -0500
parents 25261529840c
children 97f1c89cd831
rev   line source
2
25261529840c Uploaded
iracooke
parents:
diff changeset
<tool id="proteomics_search_protein_prophet_1" name="Protein Prophet" version="1.0.0">
  <requirements>
    <requirement type="package" version="1.1.9">galaxy_protk</requirement>
    <requirement type="package" version="4.6.1">trans_proteomic_pipeline</requirement>
  </requirements>

  <description>Calculate Protein Prophet statistics on search results</description>
  <!-- Note: the input file is assumed to be the first argument -->
  <command>protein_prophet.rb --galaxy $input_file -r $iproph $nooccam $groupwts $normprotlen $logprobs $confem $allpeps $unmapped $instances $delude --minprob=$minprob --minindep=$minindep</command>
  <inputs>

    <param name="input_file" type="data" format="peptideprophet_pepxml,interprophet_pepxml" multiple="false" label="Peptide Prophet Results" help="These files will typically be outputs from peptide prophet or interprophet"/>
    <param name="iproph" checked="true" type="boolean" label="Inputs are from iProphet" truevalue="--iprophet-input" falsevalue=""/>
    <param name="nooccam" type="boolean" label="Don't apply Occam's razor" help="When selected no attempt will be made to derive the simplest protein list explaining observed peptides" truevalue="--no-occam" falsevalue=""/>
    <param name="groupwts" type="boolean" label="Use group weights" help="Check peptide's total weight (rather than actual weight) in the Protein Group against the threshold" truevalue="--group-wts" falsevalue=""/>
    <param name="normprotlen" type="boolean" label="Normalize NSP using Protein Length" truevalue="--norm-protlen" falsevalue=""/>
    <param name="logprobs" type="boolean" label="Use the log of probability in the confidence calculations" truevalue="--log-prob" falsevalue=""/>
    <param name="confem" type="boolean" label="Use the EM to compute probability given the confidence" truevalue="--confem" falsevalue=""/>
    <param name="allpeps" type="boolean" label="Consider all possible peptides in the database in the confidence model" truevalue="--allpeps" falsevalue=""/>
    <param name="unmapped" type="boolean" label="Report results for unmapped proteins" truevalue="--unmapped" falsevalue=""/>
    <param name="instances" type="boolean" label="Use Expected Number of Ion Instances to adjust the peptide probabilities prior to NSP adjustment" truevalue="--instances" falsevalue=""/>
    <param name="delude" type="boolean" label="Do NOT use peptide degeneracy information when assessing proteins" truevalue="--delude" falsevalue=""/>

    <param name="minprob" type="text" label="Minimum peptide prophet probability for peptides to be considered" value="0.05"/>
    <param name="minindep" type="text" label="Minimum percentage of independent peptides required for a protein" value="0"/>
  </inputs>
  <outputs>
    <data format="protxml" name="output" metadata_source="input_file" label="protein_prophet.${input_file.display_name}.protXML" from_work_dir="protein_prophet_results.prot.xml"/>
  </outputs>
  <!--NOPLOT: do not generate plot png file
  NOOCCAM: non-conservative maximum protein list
  GROUPWTS: check peptide's total weight in the Protein Group against the threshold (default: check peptide's actual weight against threshold)
  NORMPROTLEN: Normalize NSP using Protein Length
  LOGPROBS: Use the log of the probabilities in the Confidence calculations
  CONFEM: Use the EM to compute probability given the confidence
  ALLPEPS: Consider all possible peptides in the database in the confidence model
  UNMAPPED: Report results for UNMAPPED proteins
  INSTANCES: Use Expected Number of Ion Instances to adjust the peptide probabilities prior to NSP adjustment
  DELUDE: do NOT use peptide degeneracy information when assessing proteins

  MINPROB: peptideProphet probability threshold (default=0.05)
  MININDEP: minimum percentage of independent peptides required for a protein (default=0)
  -->
  <help>

**What it does**

Given a set of peptide assignments from MS/MS spectra in the form of a pepXML file, this tool estimates probabilities at the protein level. As output, the tool produces a protXML file, which contains proteins along with the estimated probabilities that those proteins were present. Probabilities are estimated using a statistical model based on the number of peptides corresponding to each protein and the confidence that each of those peptides was assigned correctly. The model also accounts for peptides that correspond to more than one protein.

----

**Citation**

If you use this tool please read and cite the paper describing the statistical model implemented by Protein Prophet:

Nesvizhskii A., et al. “A Statistical Model for Identifying Proteins by Tandem Mass Spectrometry” *Anal. Chem.* 75, 4646-4658 (2003).

  </help>
</tool>
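The `minprob` and `minindep` parameters in the tool above are declared with `type="text"`, so any string the user types is passed straight through to the command line. The fragment below is a minimal sketch of a stricter declaration, assuming Galaxy's `float` parameter type with `min` and `max` bounds is suitable here. The bounds themselves (0 to 1 for a probability, 0 to 100 for a percentage) are assumptions inferred from the parameter labels, not values taken from the original tool.

```xml
<!-- Sketch only: numeric parameter types in place of free text. -->
<!-- Assumption: minprob is a probability, so it is bounded to [0, 1]. -->
<param name="minprob" type="float" min="0" max="1" value="0.05"
       label="Minimum peptide prophet probability for peptides to be considered"/>
<!-- Assumption: minindep is a percentage, so it is bounded to [0, 100]. -->
<param name="minindep" type="float" min="0" max="100" value="0"
       label="Minimum percentage of independent peptides required for a protein"/>
```

With these types, non-numeric or out-of-range input should be rejected in the tool form before the job runs, and the `--minprob=$minprob --minindep=$minindep` portion of the command would not need to change.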