Next changeset 1:bcb001f200e8 (2014-01-08) |
Commit message:
Push to main toolshed |
added:
Csv2Apml.jar IsoFix.jar LICENSE MsFilt.jar NOTICE NapQ.jar PRIMS.jar ProgenesisConv.jar Quantifere.jar Quantiline.jar README.rst SedMat_cli.jar csv2apml.xml datatypes_conf.xml isofix.xml msfilt.xml napq.xml prims_proteomics_datatypes.py progenesisconverter.xml quantifere.xml quantiline.xml repository_dependencies.xml sedmat.xml static/images/msfilt_csv_out.png static/images/napq_overview.png static/images/quantifere_cyto_out.png |
diff -r 000000000000 -r d50f079096ee Csv2Apml.jar |
Binary file Csv2Apml.jar has changed |
diff -r 000000000000 -r d50f079096ee IsoFix.jar |
Binary file IsoFix.jar has changed |
diff -r 000000000000 -r d50f079096ee LICENSE --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/LICENSE Wed Jan 08 11:39:16 2014 +0100 |
b'@@ -0,0 +1,202 @@\n+\n+ Apache License\n+ Version 2.0, January 2004\n+ http://www.apache.org/licenses/\n+\n+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n+\n+ 1. Definitions.\n+\n+ "License" shall mean the terms and conditions for use, reproduction,\n+ and distribution as defined by Sections 1 through 9 of this document.\n+\n+ "Licensor" shall mean the copyright owner or entity authorized by\n+ the copyright owner that is granting the License.\n+\n+ "Legal Entity" shall mean the union of the acting entity and all\n+ other entities that control, are controlled by, or are under common\n+ control with that entity. For the purposes of this definition,\n+ "control" means (i) the power, direct or indirect, to cause the\n+ direction or management of such entity, whether by contract or\n+ otherwise, or (ii) ownership of fifty percent (50%) or more of the\n+ outstanding shares, or (iii) beneficial ownership of such entity.\n+\n+ "You" (or "Your") shall mean an individual or Legal Entity\n+ exercising permissions granted by this License.\n+\n+ "Source" form shall mean the preferred form for making modifications,\n+ including but not limited to software source code, documentation\n+ source, and configuration files.\n+\n+ "Object" form shall mean any form resulting from mechanical\n+ transformation or translation of a Source form, including but\n+ not limited to compiled object code, generated documentation,\n+ and conversions to other media types.\n+\n+ "Work" shall mean the work of authorship, whether in Source or\n+ Object form, made available under the License, as indicated by a\n+ copyright notice that is included in or attached to the work\n+ (an example is provided in the Appendix below).\n+\n+ "Derivative Works" shall mean any work, whether in Source or Object\n+ form, that is based on (or derived from) the Work and for which the\n+ editorial revisions, annotations, elaborations, or other modifications\n+ represent, as a whole, an original work of 
authorship. For the purposes\n+ of this License, Derivative Works shall not include works that remain\n+ separable from, or merely link (or bind by name) to the interfaces of,\n+ the Work and Derivative Works thereof.\n+\n+ "Contribution" shall mean any work of authorship, including\n+ the original version of the Work and any modifications or additions\n+ to that Work or Derivative Works thereof, that is intentionally\n+ submitted to Licensor for inclusion in the Work by the copyright owner\n+ or by an individual or Legal Entity authorized to submit on behalf of\n+ the copyright owner. For the purposes of this definition, "submitted"\n+ means any form of electronic, verbal, or written communication sent\n+ to the Licensor or its representatives, including but not limited to\n+ communication on electronic mailing lists, source code control systems,\n+ and issue tracking systems that are managed by, or on behalf of, the\n+ Licensor for the purpose of discussing and improving the Work, but\n+ excluding communication that is conspicuously marked or otherwise\n+ designated in writing by the copyright owner as "Not a Contribution."\n+\n+ "Contributor" shall mean Licensor and any individual or Legal Entity\n+ on behalf of whom a Contribution has been received by Licensor and\n+ subsequently incorporated within the Work.\n+\n+ 2. Grant of Copyright License. 
Subject to the terms and conditions of\n+ this License, each Contributor hereby grants to You a perpetual,\n+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n+ copyright license to reproduce, prepare Derivative Works of,\n+ publicly display, publicly perform, sublicense, and distribute the\n+ Work and such Derivative Works in Source or Obj'..b'r shall be under the terms and conditions of\n+ this License, without any additional terms or conditions.\n+ Notwithstanding the above, nothing herein shall supersede or modify\n+ the terms of any separate license agreement you may have executed\n+ with Licensor regarding such Contributions.\n+\n+ 6. Trademarks. This License does not grant permission to use the trade\n+ names, trademarks, service marks, or product names of the Licensor,\n+ except as required for reasonable and customary use in describing the\n+ origin of the Work and reproducing the content of the NOTICE file.\n+\n+ 7. Disclaimer of Warranty. Unless required by applicable law or\n+ agreed to in writing, Licensor provides the Work (and each\n+ Contributor provides its Contributions) on an "AS IS" BASIS,\n+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n+ implied, including, without limitation, any warranties or conditions\n+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n+ PARTICULAR PURPOSE. You are solely responsible for determining the\n+ appropriateness of using or redistributing the Work and assume any\n+ risks associated with Your exercise of permissions under this License.\n+\n+ 8. Limitation of Liability. 
In no event and under no legal theory,\n+ whether in tort (including negligence), contract, or otherwise,\n+ unless required by applicable law (such as deliberate and grossly\n+ negligent acts) or agreed to in writing, shall any Contributor be\n+ liable to You for damages, including any direct, indirect, special,\n+ incidental, or consequential damages of any character arising as a\n+ result of this License or out of the use or inability to use the\n+ Work (including but not limited to damages for loss of goodwill,\n+ work stoppage, computer failure or malfunction, or any and all\n+ other commercial damages or losses), even if such Contributor\n+ has been advised of the possibility of such damages.\n+\n+ 9. Accepting Warranty or Additional Liability. While redistributing\n+ the Work or Derivative Works thereof, You may choose to offer,\n+ and charge a fee for, acceptance of support, warranty, indemnity,\n+ or other liability obligations and/or rights consistent with this\n+ License. However, in accepting such obligations, You may act only\n+ on Your own behalf and on Your sole responsibility, not on behalf\n+ of any other Contributor, and only if You agree to indemnify,\n+ defend, and hold each Contributor harmless for any liability\n+ incurred by, or claims asserted against, such Contributor by reason\n+ of your accepting any such warranty or additional liability.\n+\n+ END OF TERMS AND CONDITIONS\n+\n+ APPENDIX: How to apply the Apache License to your work.\n+\n+ To apply the Apache License to your work, attach the following\n+ boilerplate notice, with the fields enclosed by brackets "[]"\n+ replaced with your own identifying information. (Don\'t include\n+ the brackets!) The text should be enclosed in the appropriate\n+ comment syntax for the file format. 
We also recommend that a\n+ file or class name and description of purpose be included on the\n+ same "printed page" as the copyright notice for easier\n+ identification within third-party archives.\n+\n+ Copyright [yyyy] [name of copyright owner]\n+\n+ Licensed under the Apache License, Version 2.0 (the "License");\n+ you may not use this file except in compliance with the License.\n+ You may obtain a copy of the License at\n+\n+ http://www.apache.org/licenses/LICENSE-2.0\n+\n+ Unless required by applicable law or agreed to in writing, software\n+ distributed under the License is distributed on an "AS IS" BASIS,\n+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n+ See the License for the specific language governing permissions and\n+ limitations under the License.\n' |
diff -r 000000000000 -r d50f079096ee MsFilt.jar |
Binary file MsFilt.jar has changed |
diff -r 000000000000 -r d50f079096ee NOTICE --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/NOTICE Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,13 @@ +PRIMS proteomics toolset & Galaxy wrappers +========================================== + +Tools and wrappers for the PRIMS proteomics toolset. +Suite of custom tools to enable data processing and +protein inference for labeled and label-free Mass Spectrometry proteomics data. +Can be used in combination with PRIMS MASSCOMB (prims_masscomb package). +Copyright 2010-2013 by Pieter Lukasse, Plant Research International (PRI), +Wageningen, The Netherlands. All rights reserved. See the license text below. + +Galaxy wrappers and installation are available from the Galaxy Tool Shed at: +http://toolshed.g2.bx.psu.edu/view/pieterlukasse/prims_proteomics + |
diff -r 000000000000 -r d50f079096ee NapQ.jar |
Binary file NapQ.jar has changed |
diff -r 000000000000 -r d50f079096ee PRIMS.jar |
Binary file PRIMS.jar has changed |
diff -r 000000000000 -r d50f079096ee ProgenesisConv.jar |
Binary file ProgenesisConv.jar has changed |
diff -r 000000000000 -r d50f079096ee Quantifere.jar |
Binary file Quantifere.jar has changed |
diff -r 000000000000 -r d50f079096ee Quantiline.jar |
Binary file Quantiline.jar has changed |
diff -r 000000000000 -r d50f079096ee README.rst --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README.rst Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,67 @@ +PRIMS-proteomics toolset & Galaxy wrappers +========================================== + +Proteomics module of Plant Research International's Mass Spectrometry (PRIMS) toolsuite. +This toolset consists of custom tools to enable data processing and +protein inference for labeled and label-free Mass Spectrometry proteomics data. + +Can be used in combination with PRIMS-MASSCOMB (prims_masscomb package) and +with PRIMV-visualization (primv_visualization package). + +Copyright 2010-2013 by Pieter Lukasse, Plant Research International (PRI), +Wageningen, The Netherlands. All rights reserved. See the license text below. + +Galaxy wrappers and installation are available from the Galaxy Tool Shed at: +http://toolshed.g2.bx.psu.edu/view/pieterlukasse/prims_proteomics + +History +======= + +============== ====================================================================== +Date Changes +-------------- ---------------------------------------------------------------------- +January 2014 * first release via Tool Shed +November 2013 * multiple tools used internally at PRI +end 2011 * first tool +============== ====================================================================== + +Tool Versioning +=============== + +PRIMS tools will have versions of the form X.Y.Z. Versions +differing only after the second decimal should be completely +compatible with each other. Breaking changes should result in an +increment of the number before and/or after the first decimal. All +tools of version less than 1.0.0 should be considered beta. + + +Bug Reports & other questions +============================= + +For the time being issues can be reported via the contact form at: +http://www.wageningenur.nl/en/Persons/PNJ-Pieter-Lukasse.htm + +Developers, Contributions & Collaborations +========================================== + +If you wish to join forces and collaborate on some of the +tools do not hesitate to contact Pieter Lukasse via the contact form above. 
+ + +License (Apache, Version 2.0) +============================= + +Copyright 2013 Pieter Lukasse, Plant Research International (PRI). + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this software except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + \ No newline at end of file |
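The "Tool Versioning" rule in the README above (X.Y.Z, where versions that differ only in Z should be compatible and anything below 1.0.0 is beta) can be sketched in a few lines. This is only an illustration of the stated rule; the function names `parse_version`, `is_compatible`, and `is_beta` are hypothetical and are not part of the PRIMS tools.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split an 'X.Y.Z' string into three integers (hypothetical helper)."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected a version of the form X.Y.Z, got {version!r}")
    x, y, z = (int(p) for p in parts)
    return x, y, z


def is_compatible(a: str, b: str) -> bool:
    """Per the README rule: versions differing only in Z are compatible."""
    return parse_version(a)[:2] == parse_version(b)[:2]


def is_beta(version: str) -> bool:
    """Per the README rule: everything below 1.0.0 is considered beta."""
    return parse_version(version) < (1, 0, 0)


if __name__ == "__main__":
    print(is_compatible("1.0.2", "1.0.5"))
    print(is_beta("0.0.1"))
```

Under this reading, Csv2Apml 1.0.2 would be a stable release, while IsoFix and NapQ at 0.0.1 would count as beta.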
diff -r 000000000000 -r d50f079096ee SedMat_cli.jar |
Binary file SedMat_cli.jar has changed |
diff -r 000000000000 -r d50f079096ee csv2apml.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/csv2apml.xml Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,127 @@ +<tool name="Csv2Apml" id="csv2apml" version="1.0.2"> + <description>Converts MS/MS data in CSV format to APML format</description> + <!-- + For remote debugging start you listener on port 8000 and use the following as command interpreter: + java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 + ////////////////////////// + --> + <command interpreter="java -jar "> + Csv2Apml.jar + -peptideAndProteinMatchListCSV $peptideAndProteinMatchListCSV + -attributesMappingCSV $attributesMappingCSV + -apmlFile $apmlFile + </command> + + <inputs> + + <param name="peptideAndProteinMatchListCSV" type="data" + format="csv" label="MS/MS CSV file" + help="MS/MS CSV file containing peptide identifications and protein matches" /> + + <param name="mz" type="text" optional="false" size="30" + label="Column name for precursor m/z" /> + + <param name="rt" type="text" optional="false" size="30" + label="Column name for precursor rt" /> + + <param name="charge" type="text" optional="false" size="30" + label="Column name for precursor charge (z)" /> + + <param name="pepSequence" type="text" optional="false" size="30" + label="Column name for peptide sequence" /> + + <param name="ppidScore" type="text" optional="false" size="30" + label="Column name for peptide identification score" /> + + <param name="scoringSchemeName" type="text" optional="true" size="30" + label="(Optional) Column name containing scoring scheme name" /> + + <param name="statisticalMeasure" type="text" optional="true" size="30" + label="(Optional) Column name for reported statistical measure values" + help="(e.g. 
column containing p-values or e-values)" /> + + <param name="ppidTheoreticalMz" type="text" optional="true" size="30" + label="(Optional) Column name for peptide theoretical m/z" /> + + <param name="modifications" type="text" optional="true" size="30" + label="(Optional) Column name for reported modifications" /> + + <param name="proteinAccession" type="text" optional="false" size="30" + label="Column name for protein accession code" /> + + <param name="protSequenceLength" type="text" optional="true" size="30" + label="(Optional) Column name for protein sequence length" /> + + <param name="pepProtStart" type="text" optional="true" size="30" + label="(Optional) Column name for protein match location start" + help="Where peptide sequence starts in protein"/> + + <param name="pepProtEnd" type="text" optional="true" size="30" + label="(Optional) Column name for protein match location end" + help="Where peptide sequence ends in protein"/> + + <param name="sourceName" type="text" optional="true" size="30" + label="(Optional) Column name for sample names" /> + + </inputs> + <configfiles> + <configfile name="attributesMappingCSV">Generic name,name in S1 table CSV +mz,${mz} +rt,${rt} +charge,${charge} +pepSequence,${pepSequence} +ppidScore,${ppidScore} +proteinAccession,${proteinAccession} +#if $ppidTheoreticalMz != "None" +ppidTheoreticalMz,${ppidTheoreticalMz} +#end if +#if $modifications != "None" +modifications,${modifications} +#end if +#if $scoringSchemeName != "None" +scoringSchemeName,${scoringSchemeName} +#end if +#if $statisticalMeasure != "None" +statisticalMeasure,${statisticalMeasure} +#end if +#if $protSequenceLength != "None" +protSequenceLength,${protSequenceLength} +#end if +#if $pepProtStart != "None" +pepProtStart,${pepProtStart} +#end if +#if $pepProtEnd != "None" +pepProtEnd,${pepProtEnd} +#end if +#if $sourceName != "None" +sourceName,${sourceName} +#end if</configfile> + </configfiles> + + <outputs> + <data name="apmlFile" format="apml" 
label="${tool.name} on ${on_string}: APML" > + </data> + </outputs> + <tests> + </tests> + <help> + +.. class:: infomark + +This tool converts a CSV file containing MS/MS peptide identifications and their respective protein matches +to the APML XML format. +The identifications in APML format can be used for example to annotate unidentified MS features via SEDMAT(*). +This format is also compatible with what is expected by other post-processing tools like Quantifere (for +protein inference). + +(*)SEDMAT can use MS2 identification data +and couple it to this MS1 data, thereby annotating the MS1 feature list with identifications. + +----- + +**Output** + +This tool returns the input data in APML XML format. + + </help> +</tool> |
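The `attributesMappingCSV` configfile in csv2apml.xml above writes a header row, the six required column mappings, and then each optional mapping only when its value is not the string "None". A minimal Python sketch of the same logic follows; `build_attributes_mapping` is a hypothetical name used for illustration, and the real file is rendered by Galaxy's Cheetah templating, not by this code.

```python
from typing import Dict, Optional

# Field names copied from the configfile template in csv2apml.xml.
REQUIRED = ["mz", "rt", "charge", "pepSequence", "ppidScore", "proteinAccession"]
OPTIONAL = [
    "ppidTheoreticalMz", "modifications", "scoringSchemeName",
    "statisticalMeasure", "protSequenceLength", "pepProtStart",
    "pepProtEnd", "sourceName",
]


def build_attributes_mapping(columns: Dict[str, Optional[str]]) -> str:
    """Return the mapping CSV text: generic attribute name -> user column name."""
    lines = ["Generic name,name in S1 table CSV"]
    for key in REQUIRED:
        lines.append(f"{key},{columns[key]}")
    for key in OPTIONAL:
        value = columns.get(key)
        # Mirrors the template's `#if $x != "None"` guard; a missing key is
        # treated the same way (an assumption of this sketch).
        if value is not None and value != "None":
            lines.append(f"{key},{value}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "mz": "Precursor m/z", "rt": "RT", "charge": "z",
        "pepSequence": "Sequence", "ppidScore": "Score",
        "proteinAccession": "Accession", "modifications": "Mods",
    }
    print(build_attributes_mapping(example))
```

The column names in the usage example are made up; in practice they would be whatever headers the input CSV uses.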
diff -r 000000000000 -r d50f079096ee datatypes_conf.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/datatypes_conf.xml Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,9 @@ +<?xml version="1.0"?> +<datatypes> + <datatype_files> + <datatype_file name="prims_proteomics_datatypes.py"/> + </datatype_files> + <registration display_path="display_applications"> + <datatype extension="apml" type="galaxy.datatypes.prims_proteomics_datatypes:Apml" display_in_upload="true" /> + </registration> +</datatypes> \ No newline at end of file |
diff -r 000000000000 -r d50f079096ee isofix.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/isofix.xml Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,66 @@ +<tool name="IsoFix" id="isofix1" version="0.0.1"> + <description>Identifies in-source decay peptides and corrects protein assignments</description> + <!-- + For remote debugging start you listener on port 8000 and use the following as command interpreter: + java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 + ////////////////////////// + --> + <command interpreter="java -jar "> + IsoFix.jar + -identificationsFile $identificationsFile + -outputFile $outputFile + -format apml + -rtTol $rtTol + -logFile $logFile + #if $useOriginalProteinSequences.useOriginalProteinSequencesFile == True + -fastaFile $useOriginalProteinSequences.fastaFile + #end if + </command> + + <inputs> + + <param name="identificationsFile" type="data" format="apml" label="MS/MS identifications file" /> + + <param name="rtTol" type="integer" size="10" value="15" label="Retention time tolerance (seconds) " /> + + <param name="createLogFile" type="boolean" checked="true" label="Generate log file" help="Lists the in-source decay peptides found"/> + + <conditional name="useOriginalProteinSequences"> + <param name="useOriginalProteinSequencesFile" type="boolean" + truevalue="Yes" falsevalue="No" checked="true" + label="Use original protein sequences for detecting peptide source relations" + help="This can reduce redundancy in final set by correctly identifying which peptides derive from bigger peptides that are also identified"/> + <when value="Yes"> + <param name="fastaFile" type="data" format="fasta" label="Protein sequences (fasta file)"/> + </when> + </conditional> + + </inputs> + <outputs> + <data name="outputFile" format="apml" label="${identificationsFile.metadata.base_name} - ${tool.name} on ${on_string}: APML" metadata_source="identificationsFile"></data> + <data name="logFile" format="txt" label="${tool.name} on ${on_string} - LOG file"> + <!-- If the expression is false, the file is not created --> + <filter>( createLogFile == True )</filter> + 
</data> + </outputs> + <tests> + </tests> + <help> + +.. class:: infomark + +This tool identifies in-source decay peptides and corrects protein assignments. + +----- + +**Output example** + +This tool returns the given input file with corrected protein assignments and with +in-source decay peptides marked (by a small modification in their sequence string). +E.g. if peptide TYNSIMK is found to be an in-source decay of HETTYNSIMK, then +its sequence is changed to HET}TYNSIMK (so the decayed part + "}" + own sequence). +E.g. decay from both sides: TYNSI, HETTYNSIMK = HET}TYNSI{MK + + + </help> +</tool> |
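The sequence notation described in the IsoFix help above (decayed leading part + "}" + own sequence, and own sequence + "{" + decayed trailing part) can be illustrated with a small sketch. It covers the string notation only. How IsoFix actually detects decay (retention time tolerance, FASTA lookup) is not shown in this changeset, and `mark_in_source_decay` is a hypothetical name. Using the first occurrence of the fragment in the parent is an assumption of this sketch.

```python
def mark_in_source_decay(fragment: str, parent: str) -> str:
    """Annotate `fragment` with the parts of `parent` it lost on either side."""
    start = parent.find(fragment)  # assumption: first occurrence is the source
    if start == -1:
        raise ValueError(f"{fragment!r} is not a subsequence of {parent!r}")
    end = start + len(fragment)
    marked = fragment
    if start > 0:
        # Leading part that decayed away, followed by "}".
        marked = parent[:start] + "}" + marked
    if end < len(parent):
        # "{" followed by the trailing part that decayed away.
        marked = marked + "{" + parent[end:]
    return marked


if __name__ == "__main__":
    print(mark_in_source_decay("TYNSIMK", "HETTYNSIMK"))  # HET}TYNSIMK
    print(mark_in_source_decay("TYNSI", "HETTYNSIMK"))    # HET}TYNSI{MK
```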
diff -r 000000000000 -r d50f079096ee msfilt.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/msfilt.xml Wed Jan 08 11:39:16 2014 +0100 |
b'@@ -0,0 +1,229 @@\n+<tool name="MsFilt" id="msfilt" version="1.0.2">\r\n+\t<description>Filters annotations based MS/MS peptide identification and annotation quality measures</description>\r\n+\t<!-- \r\n+\t For remote debugging start you listener on port 8000 and use the following as command interpreter:\r\n+\t java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 \r\n+\t //////////////////////////\r\n+\t -->\r\n+\t<command interpreter="java -jar ">\r\n+\t MsFilt.jar \r\n+\t -apmlFile $apmlFile\r\n+\t -datasetCode $apmlFile.metadata.base_name\r\n+\t -rankingMetadataFile $rankingMetadataFile\r\n+\t -statisticalMeasuresConfigFile $statisticalMeasuresConfigFile\r\n+\t -annotationSourceConfigFile $annotationSourceConfigFile\r\n+\t -outApml $outputApml\r\n+\t -outNewIdsApml $outNewIdsApml\r\n+\t -outFullCSV $outputCSV \r\n+\t -outRankingTable $outRankingTable\r\n+\t -outProteinCoverageCSV $outProteinCoverageCSV\r\n+\t -fpCriteriaExpression "$fpCriteriaExpression"\r\n+\t -filterOutFPAnnotations $filterOutFPAnnotations\r\n+\t -fpCriteriaExpressionForIds "$fpCriteriaExpressionForIds"\r\n+\t -filterOutFPIds $filterOutFPIds\r\n+\t -filterOutUnannotatedAlignments $filterOutUnannotatedAlignments\r\n+\t -addRawRankingInfo $addRawRankingInfo\r\n+\t -addScaledIntensityInfo $addScaledIntensityInfo\r\n+\t -addRawIntensityInfo $addRawIntensityInfo\r\n+ \t-outReport $htmlReportFile\r\n+\t -outReportPicturesPath $htmlReportFile.files_path\r\n+\t</command>\r\n+\t\r\n+\t<inputs>\r\n+\t \t\r\n+ \t\t<param name="apmlFile" type="data" format="apml" optional="true" \r\n+ \t\t label="(Optional) Peptide quantification file (APML)" \r\n+ \t\t help="The APML contents as aligned and annotated feature lists. E.g. produced by \r\n+ \t\t SEDMAT or Quantiline tools." 
/>\r\n+ \t\t\r\n+ \t\t<repeat name="annotationSourceFiles" title="(Optional) Peptide identification files" help="Full set of MS/MS peptide identification files, including peptides that could not be quantified.">\r\n+ \t\t\t<param name="identificationsFile" type="data" format="apml,mzidentml,prims.fileset.zip" label="Identifications file (APML or MZIDENTML or MZIDENTML fileSet)" />\r\n+ \t\t\t<param name="spectraFile" type="data" format="mzidentml,prims.fileset.zip" optional="true" label="(Optional) Spectra fileSet (mzml file or fileSet)"\r\n+ \t\t\t\t help="Select this in case your Identifications file is MZIDENTML or MZIDENTML fileSet" />\r\n+ \t\t</repeat>\r\n+ \t\t\r\n+ \t<!-- \r\n+ \t<param name="maxNrRankings" type="integer" size="10" value="0" label="Maximum nr. of items to leave in the final ranking (set=0 for no limit) " />\r\n+ \t-->\r\n+ \t<!-- TODO add info somewhere that deltaRt is \'corrected deltaRt\' -->\r\n+\t\t<param name="rankingWeightConfig" type="text" area="true" size="13x70" label="Quality Measures (qm\'s) and ranking weights configuration" \r\n+\t\thelp="Here you may specify a weight for each of the Quality Measures (QMs). These are used for the final QM score and possibly for ranking (e.g. in case of label-free data\r\n+\t\tprocessed by SEDMAT). The format is: QM alias => QM name,weight. "\r\n+value="qmDRT => delta rt (standard score),1\r\n+
qmDMA => delta mass annotation (standard score),1\r\n+
qmDMP => delta mass psm (standard score),1\r\n+
qmBSCR => best peptide score (standard score),1\r\n+
qmALCV => alignment coverage (fraction),1\r\n+
qmSTCV => score type coverage (fraction),1\r\n+
qmPACV => peptide\'s best proteinAnnotCoverage (standard score),1\r\n+
qmPICV => peptide\'s best proteinIdentifCoverage (standard score),1\r\n+
qmANS => annotation sources (count),1\r\n+
qmCSEV => charge states evidence (count),0.2\r\n+
qmBCSP=> best correlation with source or product peptide (correl),1\r\n+
qmBCCS => best correlation with other charge state (correl),1\r\n+
qmBCOS => best correlation with other sibling peptide (correl),1\r\n+"/>\r\n+\r\n+\t\t<para'..b'er>\r\n+\t </data>\r\n+\t <data name="outputCSV" format="csv" label="${apmlFile.metadata.base_name} - ${tool.name} on ${on_string}: Full CSV" metadata_source="apmlFile">\r\n+\t \t<filter>( apmlFile != None )</filter>\r\n+\t </data>\r\n+\t <data name="outRankingTable" format="csv" label="${apmlFile.metadata.base_name} - ${tool.name} on ${on_string}: Ranking table (CSV)" metadata_source="apmlFile">\r\n+\t \t<filter>( apmlFile != None )</filter>\r\n+\t </data>\r\n+\t <data name="outProteinCoverageCSV" format="csv" label="${tool.name} on ${on_string}: Protein coverage details (CSV)">\r\n+\t \t<!-- If the expression is false, the file is not created -->\r\n+\t \t<filter>( len(list(enumerate(annotationSourceFiles))) > 0 )</filter>\r\n+\t </data>\r\n+\t <data name="htmlReportFile" format="html" label="${tool.name} on ${on_string} - HTML report"/>\r\n+\t</outputs>\r\n+\t<tests>\r\n+\t</tests>\r\n+ <help>\r\n+ \r\n+.. class:: infomark\r\n+ \r\n+This tool takes in peptide quantification results (e.g. either by SEDMAT for label-free data or by Quantiline for labeled data)\r\n+and calculates a number of quality measures that can help in assessing the correctness of the quantification assignment and of the MS/MS peptide \r\n+identification itself. The user can use any combination of quality measures (qm\'s) and statistical measures (sm\'s) to filter out\r\n+low scoring entries. \r\n+\r\n+.. class:: infomark\r\n+\r\n+In the label-free data processed by SEDMAT it is possible that a feature quantification gets assigned to different peptides. This means\r\n+we have an ambiguous assignment. In such a case\r\n+this tool also does a ranking of the different assignments according to their quality measures so that the best scoring assignment\r\n+gets ranked as first. \r\n+\r\n+-----\r\n+\r\n+**List of abbreviations**\r\n+\r\n+QM: Quality Measure\r\n+\r\n+SM: Statistical Measure (e.g. 
p-value, e-value from MS/MS identification) \r\n+\r\n+PSM: "Peptide to Spectrum Match" (aka peptide identification)\r\n+\r\n+FP: False Positive\r\n+\r\n+-----\r\n+\r\n+**Filtering options details**\r\n+\r\n+The FP criteria will be applied to an annotation even if the corresponding quality measures involved \r\n+in the expression can NOT ALL be determined. QMs that cannot be determined get the value 0 (zero), which is \r\n+equal to giving them the average value. \r\n+\r\n+The output report shows some plots that visualize the filtering done. This can help in fine-tuning the right filtering\r\n+criteria.\r\n+\r\n+-----\r\n+\r\n+**Output details**\r\n+\r\n+*APML output*\r\n+\r\n+This tool returns the given APML alignment file further annotated at the alignment level with the best ranking \r\n+peptides of each respective alignment. This APML can be used in subsequent Galaxy tools like the proteomics tools\r\n+from NBIC. \r\n+\r\n+The APML output can also be used for the Protein Inference step (see Quantifere tool).\r\n+\r\n+*CSV output*\r\n+\r\n+It also returns a CSV format output with the full quality measures and scoring and ranking details. The user could use\r\n+this to manually determine new weights for some of the quality measures by techniques such as \r\n+linear regression. In other words, this CSV can then be used to fine-tune the weights in a next run. \r\n+\r\n+Many of the quality measures (QMs) are normalized to their Standard Score (aka z-score). \r\n+`See Standard Score for more details...`__ \r\n+\r\n+Next to giving insight into how the ranking was established, a more complete version of this CSV file is also\r\n+generated for tools that cannot or won\'t process the APML output format. \r\n+\r\n+Below is a brief overview of the CSV and an illustration of the ranking done in case of ambiguous peptide-to-feature assignments\r\n+(explained above, can happen in case of label-free data processing by SEDMAT).\r\n+\r\n+\r\n+.. 
image:: $PATH_TO_IMAGES/msfilt_csv_out.png \r\n+\r\n+\r\n+\r\n+.. __: javascript:window.open(\'http://en.wikipedia.org/wiki/Standard_score\',\'popUpWindow\',\'height=700,width=800,left=10,top=10,resizable=yes,scrollbars=yes,toolbar=yes,menubar=no,location=no,directories=no,status=yes\')\r\n+\r\n+\r\n+\r\n+\r\n+ </help>\r\n+</tool>\r\n' |
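The MsFilt help above describes a weights configuration in the form `QM alias => QM name,weight`, quality measures normalized to standard scores (z-scores), and undetermined QMs receiving the value 0. The sketch below illustrates those three ideas. The changeset does not show how MsFilt.jar combines the weights, so the weighted sum here is an assumption, as is the use of the population standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def parse_weight_config(text: str) -> Dict[str, Tuple[str, float]]:
    """Parse lines of the form 'alias => name,weight' into {alias: (name, weight)}."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        alias, rest = line.split("=>", 1)
        name, weight = rest.rsplit(",", 1)  # the weight follows the last comma
        config[alias.strip()] = (name.strip(), float(weight))
    return config


def standard_scores(values: List[float]) -> List[float]:
    """z-score each value: (v - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def weighted_qm_score(qm_values: Dict[str, float],
                      config: Dict[str, Tuple[str, float]]) -> float:
    """Assumed combination rule: weighted sum, with missing QMs counted as 0."""
    return sum(weight * qm_values.get(alias, 0.0)
               for alias, (_name, weight) in config.items())


if __name__ == "__main__":
    cfg = parse_weight_config(
        "qmDRT => delta rt (standard score),1\n"
        "qmCSEV => charge states evidence (count),0.2\n"
    )
    print(cfg)
    print(weighted_qm_score({"qmDRT": 1.5}, cfg))
```

Treating a missing QM as 0 matches the help text's remark that 0 is the average value, since a z-score of 0 corresponds to the mean.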
diff -r 000000000000 -r d50f079096ee napq.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/napq.xml Wed Jan 08 11:39:16 2014 +0100 |
@@ -0,0 +1,93 @@ +<tool name="NapQ" id="napq" version="0.0.1"> + <description>'no alignment'(alignment-free) peptide quantification</description> + <!-- + For remote debugging start you listener on port 8000 and use the following as command interpreter: + java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 + ////////////////////////// + --> + <command interpreter="java -jar "> + NapQ.jar + -identificationsConfigFile $identificationsConfigFile + -namingConventionCodesForSamples $namingConventionCodesForSamples + #if $is2D_LC_MS.fractions == True + -namingConventionCodesForFractions $is2D_LC_MS.namingConventionCodesForFractions + #end if + -outputApml $outputApml + -outputTsv $outputTsv + -outReport $htmlReportFile + -outReportPicturesPath $htmlReportFile.files_path + </command> + + <inputs> + + <repeat name="identificationFileList" title="Peptide identification files" help="Full set of MS/MS peptide identification files, including peptides that could not be quantified."> + <param name="identificationsFile" type="data" format="apml,mzidentml,prims.fileset.zip" label="Identifications file (APML or MZIDENTML or MZIDENTML fileSet)" /> + <param name="spectraFile" type="data" format="mzidentml,prims.fileset.zip" optional="true" label="(Optional) Spectra fileSet (mzml file or fileSet)" + help="Select this in case your Identifications file is MZIDENTML or MZIDENTML fileSet" /> + </repeat> + + <param name="namingConventionCodesForSamples" type="text" size="100" value="" + label="Part of run/file name that identifies the sample" + help="Add the CSV list of codes that occur in the file names + and that stand for a sample code. E.g. '_S1,_S2,_S3,etc.' "/> <!-- could do regular expressions as well but this would be hard for biologists, e.g. 
_F\d\b --> + + + <conditional name="is2D_LC_MS"> + <param name="fractions" type="boolean" truevalue="Yes" falsevalue="No" checked="false" + label="Data is from 2D LC-MS" + help="Data acquisition was done in multiple fractions."/> + <when value="Yes"> + <param name="namingConventionCodesForFractions" type="text" size="100" value="" + label="Part of run/file name that identifies the 2D LC-MS fraction" + help="Add the CSV list of codes that occur in the file names + and that stand for a fraction code. E.g. '_F1,_F2,_F3,etc.' Use this to avoid + that each (fraction) file is seen as a separate run."/> <!-- could do regular expressions as well but this would be hard for biologists, e.g. _F\d\b --> + </when> + </conditional> + + </inputs> + <configfiles> + <configfile name="identificationsConfigFile">## start comment + ## iterate over the selected files and store their names in the config file + #for $i, $s in enumerate( $identificationFileList ) + ${s.identificationsFile}|${s.spectraFile} + ## also print out the datatype in the next line, based on previously configured datatype + #if isinstance( $s.identificationsFile.datatype, $__app__.datatypes_registry.get_datatype_by_extension('apml').__class__): + apml + #else: + mzid + #end if + #end for + ## end comment</configfile> + </configfiles> + <outputs> + <data name="outputApml" format="apml" label="${tool.name} on ${on_string}: peptide quantifications (APML)"/> + <data name="outputTsv" format="tabular" label="${tool.name} on ${on_string}: peptide quantifications (TSV)"/> + <!-- in tsv we can have cols like: pep, avg_m/z, avg rt, m/z window, rt window, i_s1, i_s2, ...--> + <data name="htmlReportFile" format="html" label="${tool.name} on ${on_string} - HTML report"/> + <!-- here we show the samples extracted and the files used to 'build up' each sample --> + </outputs> + <tests> + </tests> + <help> + +.. 
class:: infomark + +This tool takes in multiple peptide identification result files that have peptide identifications +coupled to some quantification (e.g. precursor intensity information or for example data coming +from MS^E acquisition where peptide identification and quantification are done in the same run and reported together). +Then, based on the given experiment design parameters (i.e. how the result files relate back to +replicate runs and samples), it produces a new file in which the peptides are reported with +their calculated quantifications at the sample level. + +The figure below explains this: + +.. image:: $PATH_TO_IMAGES/napq_overview.png + + + + + + + </help> +</tool> |
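The naming-convention parameters of the NapQ wrapper above take a CSV list of codes (e.g. '_S1,_S2,_S3') that occur in the run/file names. The following is a minimal sketch of how such codes could map file names to samples. The function name `group_by_code`, the substring matching and the longest-code-first rule are assumptions made for illustration; the actual logic lives in NapQ.jar and is not shown in this changeset.

```python
from collections import defaultdict


def group_by_code(file_names, codes_csv):
    """Group file names by the first naming-convention code they contain.

    Hypothetical helper: simple substring matching, not NapQ's real logic.
    """
    codes = [c.strip() for c in codes_csv.split(",") if c.strip()]
    # Try longer codes first so that '_S10' is not mistaken for '_S1'.
    codes.sort(key=len, reverse=True)
    groups = defaultdict(list)
    for name in file_names:
        for code in codes:
            if code in name:
                groups[code].append(name)
                break
        else:
            # No code matched: keep the file under the key None.
            groups[None].append(name)
    return dict(groups)


files = ["run_S1_F1.mzid", "run_S1_F2.mzid", "run_S2_F1.mzid", "blank.mzid"]
samples = group_by_code(files, "_S1,_S2")
```

With the fraction codes ('_F1,_F2') the same helper would group files per fraction instead, which is the situation the 2D LC-MS option describes: fraction files of one sample should not each be treated as a separate run.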
b |
diff -r 000000000000 -r d50f079096ee prims_proteomics_datatypes.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/prims_proteomics_datatypes.py Wed Jan 08 11:39:16 2014 +0100 |
b |
@@ -0,0 +1,42 @@ +""" +PRIMS proteomics classes for types defined in datatypes_conf.xml +""" +import logging +import re +from galaxy.datatypes.data import * +from galaxy.datatypes.xml import * +from galaxy.datatypes.sniff import * +from galaxy.datatypes.binary import * +from galaxy.datatypes.interval import * + +log = logging.getLogger(__name__) + + +class ProteomicsXml(GenericXml): + """ An enhanced XML datatype used to reuse code across several + proteomic/mass-spec datatypes. (this part of the code is taken from protk proteomics datatypes package) """ + + def sniff(self, filename): + """ Determines whether the file is the correct XML type. """ + with open(filename, 'r') as contents: + while True: + line = contents.readline() + if line == None or not line.startswith('<?'): + break + pattern = '^<(\w*:)?%s' % self.root # pattern match <root or <ns:root for any ns string + return line != None and re.match(pattern, line) != None + + def set_peek( self, dataset, is_multi_byte=False ): + """Set the peek and blurb text""" + if not dataset.dataset.purged: + dataset.peek = data.get_file_peek( dataset.file_name, is_multi_byte=is_multi_byte ) + dataset.blurb = self.blurb + else: + dataset.peek = 'file does not exist' + dataset.blurb = 'file purged from disk' + +class Apml( ProteomicsXml ): + """APML data""" + file_ext = "apml" + blurb = 'PRIMS APML proteomics data' + root = "apml" \ No newline at end of file |
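The `sniff` method in the hunk above skips leading `<?...` lines and then checks whether the next line opens the expected root element, with or without a namespace prefix. Below is a standalone sketch of that check for Python 3, outside Galaxy. The function name `sniff_root` is hypothetical, and `re.escape` is added here as a precaution that the original does not apply.

```python
import re


def sniff_root(filename, root):
    """Return True if the first line that is not an XML declaration or
    processing instruction starts with <root or <ns:root."""
    pattern = re.compile(r'^<(\w*:)?%s' % re.escape(root))
    with open(filename, 'r') as contents:
        for line in contents:
            if not line.startswith('<?'):
                # First non-'<?' line decides the outcome, as in the original.
                return pattern.match(line) is not None
    # Empty file, or only '<?...' lines were found.
    return False
```

Like the original, this treats a blank line before the root element as a mismatch, and it matches any element name that merely starts with `root`.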
b |
diff -r 000000000000 -r d50f079096ee progenesisconverter.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/progenesisconverter.xml Wed Jan 08 11:39:16 2014 +0100 |
b |
@@ -0,0 +1,68 @@ +<tool name="ProgenesisConverter" id="progenesisconv1" version="1.0.2"> + <description>Converts Progenesis aligned feature lists in CSV format to APML</description> + <!-- + For remote debugging start your listener on port 8000 and use the following as command interpreter: + java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 + ////////////////////////// + --> + <command interpreter="java -jar "> + ProgenesisConv.jar + -progenesisFile $progenesisFile + -apmlFile $apmlFile + #if $multipleScoringSchemes.containsMultipleScoringSchemes == True + -scoringSchemeNameColumn $multipleScoringSchemes.scoringSchemeNameColumn + #end if + #if $statisticalMeasure.containsStatisticalMeasure == True + -statisticalMeasureColumn $statisticalMeasure.statisticalMeasureColumn + #end if + </command> + + <inputs> + + <param name="progenesisFile" type="data" format="csv" label="Progenesis aligned feature lists CSV file" /> + + <conditional name="multipleScoringSchemes"> + <param name="containsMultipleScoringSchemes" type="boolean" truevalue="Yes" falsevalue="No" checked="false" + label="Progenesis scores contain multiple scoring schemes" + help="Set this if the scores in the 'Score' column come from two or more different schemes (e.g. MSE and DDA)"/> + <when value="Yes"> + <param name="scoringSchemeNameColumn" type="text" optional="true" size="30" + label="Column name" + help="Name of the column containing the scoring scheme name" /> + </when> + </conditional> + + <conditional name="statisticalMeasure"> + <param name="containsStatisticalMeasure" type="boolean" truevalue="Yes" falsevalue="No" checked="false" + label="Input sheet contains a statistical measure column" + help="Set this if the input sheet also contains a column with a statistical measure (e.g. 
p-value, e-value, etc.)"/> + <when value="Yes"> + <param name="statisticalMeasureColumn" type="text" optional="true" size="30" + label="Column name" + help="Name of the column containing the statistical measure" /> + </when> + </conditional> + + </inputs> + <outputs> + <data name="apmlFile" format="apml" label="${progenesisFile.metadata.base_name} - ${tool.name} on ${on_string}: APML" metadata_source="progenesisFile"> + </data> + </outputs> + <tests> + </tests> + <help> + +.. class:: infomark + +This tool converts a Progenesis CSV file to the APML XML format. +This format can be used to submit the data for annotation by SEDMAT. SEDMAT can use MS2 identification data +and couple it to this MS1 data, thereby annotating the MS1 feature list with identifications. + +----- + +**Output example** + +This tool returns APML output that can be used as input for the SEDMAT tool. + + </help> +</tool> |
b |
diff -r 000000000000 -r d50f079096ee quantifere.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/quantifere.xml Wed Jan 08 11:39:16 2014 +0100 |
[ |
b'@@ -0,0 +1,206 @@\n+<tool name="Quantifere" id="quantifere1" version="1.0.2">\r\n+\t<description>Protein Inference by Peptide Quantification patterns</description>\r\n+\t<!-- \r\n+\t For remote debugging start you listener on port 8000 and use the following as command interpreter:\r\n+\t java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 \r\n+\t //////////////////////////\r\n+\t -->\r\n+\t<command interpreter="java -jar ">\r\n+\t Quantifere.jar \r\n+\t -annotatedQuantificationFilesList $annotatedQuantificationFilesList\r\n+\t -identificationFilesList $identificationFilesList\r\n+ \t-statisticalMeasuresConfigFile $statisticalMeasuresConfigFile\r\n+\t -quantificationDataToUse $quantificationDataToUse\r\n+\t -minCorrel $minCorrel\r\n+\t -minProtCoverage $minProtCoverage\r\n+\t -minAboveAverageHits $minAboveAverageHits\r\n+\t -minNrIdsForInferencePeptide $minNrIdsForInferencePeptide\r\n+\t -refineModel $refineModel\r\n+\t -functionalAnnotationCSV $functionalAnnotationCSV\r\n+\t -outputCSV $outputCSV\r\n+\t -outputInferenceLogCSV $outputInferenceLogCSV\r\n+\t -outputSummaryAnnotationCSV $outputSummaryAnnotationCSV\r\n+\t -outReport $htmlReportFile\r\n+\t -outReportPicturesPath $htmlReportFile.files_path\r\n+\t #if $is2D_LC_MS.fractions == True\r\n+ \t-namingConventionCodesForFractions $is2D_LC_MS.namingConventionCodesForFractions\r\n+ #end if\r\n+\t</command>\r\n+\t\r\n+\t<inputs>\r\n+\t \t\r\n+ \t\t<repeat name="annotatedQuantificationFiles" title="Peptide (filtered) quantification files (APML)" \r\n+ \t\thelp="The APML contents as aligned, annotated and scored feature lists, \r\n+ \t\tas produced by MsFilt tool. Select one or more files. 
For 2D-LC-MS we expect one file per fraction.">\r\n+ \t\t\t<param name="annotatedQuantificationFile" size="50" type="data" format="apml" label="File (APML format)" />\r\n+ \t\t</repeat>\r\n+ \t\t\r\n+ \t\t<repeat name="identificationFiles" title="Peptide (filtered) identification files (MS/MS identifications)" \r\n+ \t\thelp="Full set of MS/MS peptide identification files, including peptides that could not be quantified.\r\n+ \t\tThis set of identifications is ideally filtered on some quality and \r\n+ \t\tstatistical measures (e.g. as is done by MsFilt). Tip: to base the inference only on the \r\n+ \t\tselected peptide quantification files, you\r\n+ \t\tcan select the same quantification files here as well. Select one or more files.">\r\n+ \t\t\t<param name="identificationFile" size="50" type="data" format="apml,mzid" label="File (APML or MZIDENTML format)" />\r\n+ \t\t</repeat>\r\n+ \t\t\r\n+ \t\t<conditional name="is2D_LC_MS">\r\n+ \t\t<param name="fractions" type="boolean" truevalue="Yes" falsevalue="No" checked="false" \r\n+ \t\tlabel="Data is from 2D LC-MS"\r\n+ \t\thelp="Data acquisition was done in multiple fractions."/>\r\n+ \t\t<when value="Yes"> \r\n+ \t\t\t<param name="namingConventionCodesForFractions" type="text" size="100" value="" \r\n+ \t\t\tlabel="Part of run/file name that identifies the 2D LC-MS fraction" \r\n+ \t\t\thelp="Add the CSV list of codes that occur in the file names \r\n+ \t\t\t\tand that stand for a fraction code. E.g. \'_F1,_F2,_F3,etc.\' In this\r\n+ \t\t\t\tway different peptide identifications from the same sample but measured \r\n+ \t\t\t\tin different fractions can be merged together. Otherwise each (fraction) file\r\n+ \t\t\t\tis seen as a separate sample."/> <!-- could do regular expressions as well but this would be hard for biologists, e.g. 
_F\\d\\b -->\r\n+ \t\t</when>\r\n+ \t</conditional>\r\n+ \t\t\r\n+ \t\t<param name="statisticalMeasuresConfig" type="text" area="true" size="6x70" label="Statistical measures configuration" \r\n+\t\t\t\thelp="Here you may specify the statistical measures that are found in the ms/ms results (e.g. p or e-values). \r\n+\t\t\t\tThe format is: SM alias => SM name,type,mode[min/max]. Leaving this configuration out while these are present in the\r\n+\t\t\t\tdataset will have the effect that they will be wrongly used as a regular scoring scheme, having effect on'..b' \t\r\n+ \t\r\n+ \t<param name="summaryReport" type="boolean" checked="true" label="Generate summary report"/>\r\n+ \t\r\n+\t</inputs>\r\n+\t<configfiles>\r\n+\t\t<configfile name="annotatedQuantificationFilesList">## start comment\r\n+\t\t## iterate over the selected files and store their names in the config file\r\n+\t\t#for $i, $s in enumerate( $annotatedQuantificationFiles )\r\n+\t\t\t${s.annotatedQuantificationFile}\r\n+\t\t#end for\r\n+\t\t## end comment</configfile>\r\n+\t\t\r\n+\t\t<configfile name="identificationFilesList">## start comment\r\n+\t\t## iterate over the selected files and store their names in the config file\r\n+\t\t#for $i, $s in enumerate( $identificationFiles )\r\n+\t\t\t${s.identificationFile}\r\n+\t\t\t## also print out the datatype in the next line, based on previously configured datatype\r\n+\t\t\t#if isinstance( $s.identificationFile.datatype, $__app__.datatypes_registry.get_datatype_by_extension(\'apml\').__class__):\r\n+\t\t\t\tapml\r\n+\t\t\t#else:\r\n+ \t\tmzid\r\n+ \t\t#end if\r\n+\t\t#end for\r\n+\t\t## end comment</configfile>\r\n+\t\t<configfile name="statisticalMeasuresConfigFile">## start comment\r\n+\t\t\t${statisticalMeasuresConfig}\r\n+\t\t</configfile>\r\n+\t</configfiles>\r\n+\t<outputs>\r\n+\t <data name="outputCSV" format="csv" label="${tool.name} on ${on_string}: Proteins list (CSV)" />\r\n+\t <data name="outputInferenceLogCSV" format="csv" label="${tool.name} on 
${on_string}: Inference log (CSV)"/>\r\n+\t <data name="htmlReportFile" format="html" label="${tool.name} on ${on_string} - HTML report">\r\n+\t \t<!-- If the expression is false, the file is not created -->\r\n+\t \t<filter>( summaryReport == True )</filter>\r\n+\t </data>\r\n+\t <data name="outputSummaryAnnotationCSV" format="csv" label="${tool.name} on ${on_string} - Functional annotation summary (CSV)">\r\n+\t \t<!-- If the expression is false, the file is not created -->\r\n+\t \t<filter>( functionalAnnotationCSV != None )</filter>\r\n+\t </data>\r\n+\t</outputs>\r\n+\t<tests>\r\n+\t</tests>\r\n+ <help>\r\n+ \r\n+.. class:: infomark\r\n+ \r\n+This tool takes Peptide Quantification patterns and uses them to do Protein Inference of both Primary Protein \r\n+identifications as well as Secondary Protein identifications. This last class of protein identifications \r\n+cannot be done by traditional protein inference methods that look only at peptide identifications and \r\n+their quality parameters. \r\n+\r\n+\r\n+-----\r\n+\r\n+**List of definitions**\r\n+\r\n+Primary Protein identification: protein identification belonging to the minimum set of proteins needed\r\n+to account for the observed peptides. \r\n+\r\n+Secondary Protein identification: extra protein identifications that do not belong to the minimum set\r\n+of proteins mentioned above. \r\n+\r\n+raw intensities : is the intensity value resulting from the integration of the feature peak area\r\n+\r\n+apex intensities: is the intensity value at the highest point of the feature peak\r\n+\r\n+normalized intensities : is the intensity normalized by some means\r\n+\r\n+-----\r\n+\r\n+**Minimum correlation in a cluster**\r\n+\r\n+TODO - add doc.\r\n+\r\n+-----\r\n+\r\n+**Output details**\r\n+\r\n+*Proteins list (CSV)*\r\n+\r\n+This is the list of primary and secondary proteins and their calculated inference score. 
Proteins \r\n+with exactly the same peptide hits are also grouped together and labeled as primary_group and secondary_group\r\n+instead of simply primary and secondary.\r\n+\r\n+\r\n+*Inference log (CSV)*\r\n+\r\n+This CSV table shows all data, both inferred and ruled out proteins. This can be used by the user to \r\n+troubleshoot the inference process and understand why certain proteins might have been ruled out. \r\n+The CSV is provided in such a format that the data can easily be explored in a Cytoscape network. \r\n+\r\n+The figure below shows an example of the data being explored in Cytoscape using also the \r\n+`Cytoscape chartplugin`_ to visualize the quantification data when selecting the peptide nodes. \r\n+\r\n+.. image:: $PATH_TO_IMAGES/quantifere_cyto_out.png \r\n+\r\n+\r\n+.. _Cytoscape chartplugin: http://apps.cytoscape.org/apps/chartplugin\r\n+\r\n+\r\n+\r\n+ </help>\r\n+</tool>\r\n' |
b |
diff -r 000000000000 -r d50f079096ee quantiline.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/quantiline.xml Wed Jan 08 11:39:16 2014 +0100 |
b |
@@ -0,0 +1,62 @@ +<tool name="Quantiline" id="quantiline1" version="1.0.2"> + <description>Labeled ms/ms data pre-processing for Protein Quantification (and Inference) pipelines</description> + <!-- + For remote debugging start your listener on port 8000 and use the following as command interpreter: + java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 + ////////////////////////// + --> + <command interpreter="java -jar "> + Quantiline.jar + -ppidsFileName $ppidsFileName + -spectraDataFile $spectraDataFile + -ppidsInputFormat MZID + -labelMzValues "$labelMzValues" + -labelmTol $labelmTol + -outputFile $outputFile + -outReport $outReport + </command> + <inputs> + + <param name="ppidsFileName" type="data" format="prims.fileset.zip" label="MS/MS peptide identifications fileSet (N mzidentml files)"/> + <param name="spectraDataFile" type="data" format="prims.fileset.zip" label="MS/MS spectra fileSet (N mzml files)"/> + + <param name="labelMzValues" type="text" size="20" label="Label m/z values" + help="e.g. for 4-plex iTRAQ : 114.0,115.0,116.0,117.0"/> + + <param name="labelmTol" type="float" size="10" value="0.5" label="Label detection tolerance (Da)" + help="Tolerance in daltons for label detection."/> + + </inputs> + <outputs> + <data name="outputFile" format="apml" label="${tool.name} on ${on_string}: Peptides quantification (APML)" /> + <data name="outReport" format="html" label="${tool.name} on ${on_string}: Peptides quantification report (HTML)"/> + </outputs> + <tests> + </tests> + <help> + +.. class:: infomark + +This tool can read spectra files (mzML) and their respective identification files (mzIdentML) and based +on the configured label masses produce a file that contains the merged information: +peptides and their quantification based on label fragment intensity values read from the spectrum in which they +were identified. + +In other words, it produces the peptide (relative) quantification file. 
This file can subsequently be used +by other tools for protein inference and protein quantification (e.g. Quantifere). + + +----- + +**Output details** + +*Peptide quantification file (APML)* + +This is the list of peptides with their (relative) quantification based on the labels and their +intensities found in the label peaks of the corresponding spectrum. + + + + + </help> +</tool> |
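The Quantiline help text above describes reading label fragment intensities from each identified spectrum, using the configured label m/z values and a tolerance in daltons. The sketch below illustrates that idea only. The function name `label_intensities`, the peak representation as `(mz, intensity)` tuples, and the choice of the most intense peak inside the window are assumptions; how Quantiline.jar actually selects or sums peaks is not shown in this changeset.

```python
def label_intensities(peaks, label_mz_csv, tol_da=0.5):
    """Return {label m/z: intensity} for one spectrum.

    peaks        -- iterable of (mz, intensity) tuples (assumed layout)
    label_mz_csv -- e.g. "114.0,115.0,116.0,117.0", as in the labelMzValues help
    tol_da       -- detection tolerance in daltons (labelmTol defaults to 0.5)
    """
    labels = [float(v) for v in label_mz_csv.split(",")]
    result = {}
    for label in labels:
        # Collect intensities of all peaks within +/- tol_da of the label m/z.
        in_window = [inten for mz, inten in peaks if abs(mz - label) <= tol_da]
        # Assumption: report the most intense peak, or 0.0 if none was found.
        result[label] = max(in_window) if in_window else 0.0
    return result


spectrum = [(114.1, 1000.0), (114.3, 200.0), (115.0, 500.0), (300.2, 9000.0)]
reporter = label_intensities(spectrum, "114.0,115.0,116.0,117.0")
```

Dividing each reporter intensity by a chosen reference channel would give the relative quantification that the help text mentions; which channel serves as the reference is a design choice left open here.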
b |
diff -r 000000000000 -r d50f079096ee repository_dependencies.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/repository_dependencies.xml Wed Jan 08 11:39:16 2014 +0100 |
b |
@@ -0,0 +1,5 @@ +<?xml version="1.0"?> +<repositories description="Required proteomics dependencies."> + <repository toolshed="http://toolshed.g2.bx.psu.edu" name="proteomics_datatypes" owner="iracooke" changeset_revision="09b89b345de2" /> + <repository toolshed="http://testtoolshed.g2.bx.psu.edu" name="proteomics_datatypes" owner="iracooke" changeset_revision="7101f7e4b00b" /> +</repositories> |
b |
diff -r 000000000000 -r d50f079096ee sedmat.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/sedmat.xml Wed Jan 08 11:39:16 2014 +0100 |
b |
b'@@ -0,0 +1,144 @@\n+<tool name="SedMat" id="sedmat1" version="1.0.2">\r\n+\t<description>Matches MS and MS/MS results</description>\r\n+\t<!-- \r\n+\t For remote debugging start you listener on port 8000 and use the following as command interpreter:\r\n+\t java -jar -Xdebug -Xrunjdwp:transport=dt_socket,address=D0100564.wurnet.nl:8000 \r\n+\t -->\r\n+\t<command interpreter="java -jar ">\r\n+\t SedMat_cli.jar \r\n+\t -pl $inputMS \r\n+\t -plInputFormat apml \r\n+\t -ppids $fileType.inputFormatType.ppidsFile \r\n+\t -ppidsFileGrouping $fileType.type \r\n+\t -ppidsInputFormat $fileType.inputFormatType.ppidsInputFormat\r\n+\t -ppidsFileDescription $fileType.inputFormatType.ppidsFile.name \r\n+\t #if $fileType.inputFormatType.ppidsInputFormat == "mzid"\r\n+\t\t\t-spectraDataFile $fileType.inputFormatType.spectraDataFile\r\n+\t\t#end if \r\n+\t -out $outputData \r\n+\t -outUnmatchedMS2 $outUnmatchedMS2\r\n+\t -mtol $mtol \r\n+\t -rttol $rttol \r\n+\t -rtShiftDetectionWindow $rtShiftDetectionWindow\r\n+\t -matchOnSameSourceOnly $matchOnSameSourceOnly\r\n+\t -chargeStatesToGenerate $chargeStatesToGenerate\r\n+\t -outReport $htmlReportFile\r\n+\t -outReportPicturesPath $htmlReportFile.files_path\r\n+ #if $troubleshoot1.troubleshootPeakLocations == True\r\n+ \t-troubleshootPeakLocations YES\r\n+ \t-mStart $troubleshoot1.mStart\r\n+ \t-mEnd $troubleshoot1.mEnd\r\n+ \t-rtStart $troubleshoot1.rtStart\r\n+ \t-rtEnd $troubleshoot1.rtEnd\r\n+ \t-filterSourceName $troubleshoot1.filterSourceName\r\n+ #end if\r\n+ #if $matchOnNamingConvention.match == True\r\n+ \t-matchOnNamingConvention YES\r\n+ \t-namingConventionCodesForMatching $matchOnNamingConvention.namingConventionCodesForMatching\r\n+ #end if\r\n+ \t \r\n+\t</command>\r\n+\t\r\n+\t<inputs>\r\n+\t \t\r\n+ \t\t<param name="inputMS" type="data" format="apml" label="MS data (APML format)" />\r\n+\t \t<!-- possible option <validator type="metadata" check="base_name" message="Metadata missing, click the pencil icon in the 
history item and set base_name."/> -->\r\n+\r\n+\t \t<conditional name="fileType">\r\n+\t\t <param name="type" type="select" label="Peptide identification file grouping type">\r\n+\t\t <option value="single" selected="true">single-File</option>\r\n+\t\t <option value="fileSet">fileSet</option>\r\n+\t\t </param>\r\n+\t\t <when value="single">\r\n+\t\t <conditional name="inputFormatType">\r\n+\t\t \t<param name="ppidsInputFormat" type="select" label="MS/MS input format">\r\n+\t\t\t \t<option value="mzid" selected="true">mzIdentML on mzML</option>\r\n+\t\t\t \t<option value="apml">APML</option>\r\n+\t\t\t\t</param>\r\n+\t\t\t\t<when value="mzid">\r\n+\t\t \t\t<param name="spectraDataFile" type="data" format="mzml" label="MS/MS spectra file (mzml)"/>\r\n+\t\t \t\t<param name="ppidsFile" type="data" format="mzid" label="MS/MS peptide identifications file (mzidentml)"/>\r\n+\t\t \t</when>\r\n+\t\t \t<when value="apml">\r\n+\t\t \t\t<param name="ppidsFile" type="data" format="apml" label="MS/MS peptide identifications file (apml)">\r\n+\t\t \t\t\t<!-- TODO - find out how to use\r\n+\t\t \t\t\t<validator type="expression" message="You already selected this file as the MS data file.">value.id == inputMS,{"inputMS":$inputMS},{}</validator>-->\r\n+\t\t \t\t</param>\r\n+\t\t \t</when>\r\n+\t\t </conditional>\r\n+\t\t </when>\r\n+\t\t <when value="fileSet">\r\n+\t\t <conditional name="inputFormatType">\r\n+\t\t \t<param name="ppidsInputFormat" type="select" label="inputFormat">\r\n+\t\t\t \t<option value="mzid" selected="true">mzIdentML on mzML</option>\r\n+\t\t\t\t</param>\r\n+\t\t\t\t<when value="mzid">\r\n+\t\t \t\t<param name="spectraDataFile" type="data" format="prims.fileset.zip" label="MS/MS spectra fileSet (N mzml files)"/>\r\n+\t\t \t\t<param name="ppidsFile" type="data" format="prims.fileset.zip" label="MS/MS peptide identifications fileSet (N mzidentml files)"/>\r\n+\t\t \t</when>\r\n+\t\t </conditional>\r\n+\t\t </when>\r\n+\t\t</conditional>\r\n+\t\t<param 
name="mtol" type="integer" size="10" value="50" lab'..b' tolerance (ppm) " />\r\n+\t\t<param name="rttol" type="integer" size="10" value="150" label="Retention time tolerance (seconds) " />\r\n+\t\t<param name="rtShiftDetectionWindow" type="integer" size="10" value="20" label="Retention time shift detection window (seconds) " help="Size of the window to use for average RT shift calculations"/>\r\n+\r\n+\t\t<param name="matchOnSameSourceOnly" type="boolean" checked="false" label="Match peaks from same source only" help="If you want this, you might have to specify how to match the source files"/>\r\n+ \t<conditional name="matchOnNamingConvention">\r\n+ \t\t<param name="match" type="boolean" truevalue="Yes" falsevalue="No" checked="false" label="Match using naming convention" help="Use a list of codes that occur in the file names and that link them together."/>\r\n+ \t\t<when value="Yes">\r\n+ \t\t\t<param name="namingConventionCodesForMatching" type="text" size="100" value="" label="List of codes in naming convention" help="Add the CSV list of codes that occur in the file names and that link them together. E.g. 
\'_F1,_F2,_F3,etc.\'"/>\r\n+ \t\t</when>\r\n+ \t</conditional>\t \r\n+\r\n+ \t\t<param name="chargeStatesToGenerate" type="select" display="checkboxes" multiple="true" label="Generate extra charge states" help="The selected charge states will be generated for each MS2 feature ">\r\n+\t \t<option value="1" selected="true">1</option>\r\n+\t \t<option value="2" selected="true">2</option>\r\n+\t \t<option value="3" selected="true">3</option>\r\n+\t \t<option value="4" selected="true">4</option>\r\n+\t \t<option value="5">5</option>\r\n+\t\t</param>\r\n+\r\n+ \t\t<param name="summaryReport" type="boolean" checked="true" label="Generate summary report" help="NB: this will increase the processing time"/>\r\n+ \t\r\n+ \t<conditional name="troubleshoot1">\r\n+ \t\t<param name="troubleshootPeakLocations" type="boolean" truevalue="Yes" falsevalue="No" checked="false" label="Troubleshoot ms1/ms2 peak locations" help="Small trial run to check if the MS and MS/MS peak lists in their current states can easily be matched "/>\r\n+ \t\t<when value="Yes">\r\n+ \t\t\t<param name="mStart" optional="false" type="integer" size="10" value="100" label="Set m/z start " />\r\n+ \t\t\t<param name="mEnd" optional="false" type="integer" size="10" value="1000" label="Set m/z end " />\r\n+\t\t\t\t<param name="rtStart" optional="false" type="integer" size="10" value="10" label="Set retention time start (minutes) " />\r\n+\t\t\t\t<param name="rtEnd" optional="false" type="integer" size="10" value="20" label="Set retention time end (minutes) " />\r\n+\t\t\t\t<param name="filterSourceName" type="text" size="100" value="" label="Restrict matching to a specific subset of the files " help="Part of a file name that occurs in both a ms1 and ms2 file (e.g. 
\'RibO_1_msE1\')"/>\r\n+ \t\t</when>\r\n+ \t</conditional>\r\n+ \t\r\n+\t</inputs>\r\n+\t<outputs>\r\n+\t <data name="outputData" format="apml" label="${inputMS.metadata.base_name} - ${tool.name} on ${on_string}: APML" metadata_source="inputMS"></data>\r\n+\t <data name="outUnmatchedMS2" format="csv" label="${inputMS.metadata.base_name} - ${tool.name} on ${on_string}: unmatched MS2 features CSV" metadata_source="inputMS"></data>\r\n+\t <data name="htmlReportFile" format="html" label="${tool.name} on ${on_string} - HTML report">\r\n+\t \t<!-- If the expression is false, the file is not created -->\r\n+\t \t<filter>( summaryReport == True )</filter>\r\n+\t </data>\r\n+\t</outputs>\r\n+\t<tests>\r\n+\t <!-- find out how to use -->\r\n+\t <test>\r\n+\t </test>\r\n+\t</tests>\r\n+ <help>\r\n+ \r\n+.. class:: infomark\r\n+ \r\n+This tool matches MS and MS/MS results. SEDMAT stands for "Single Experiment Data Matching Tool".\r\n+It can match peaks found in the MS spectra with the peptides found using the MS/MS spectra.\r\n+The result is the list of MS peaks annotated with peptides and proteins.\r\n+\r\n+-----\r\n+\r\n+**Output example**\r\n+\r\n+This tool returns APML output, a Cytoscape network (.xgmml) of the matches and Retention Time plots (.pdf). \r\n+\r\n+ </help>\r\n+</tool>\r\n' |
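The SedMat wrapper above exposes a tolerance in ppm (`mtol`, default 50) and a retention time tolerance in seconds (`rttol`, default 150) for matching MS peaks to MS/MS identifications. The sketch below shows what such a two-window test can look like. The function name `within_tolerance`, the reading of `mtol` as an m/z tolerance, and the use of the MS/MS value as the ppm reference are all assumptions; SedMat_cli.jar's actual matching, including its retention time shift correction, is not part of this changeset.

```python
def within_tolerance(ms1_mz, ms1_rt, ms2_mz, ms2_rt, mtol_ppm=50, rttol_s=150):
    """Return True if an MS1 peak and an MS2 feature agree within both windows.

    m/z difference is expressed in parts per million relative to ms2_mz
    (assumed reference); retention times are in seconds.
    """
    ppm_diff = abs(ms1_mz - ms2_mz) / ms2_mz * 1e6
    rt_diff = abs(ms1_rt - ms2_rt)
    return ppm_diff <= mtol_ppm and rt_diff <= rttol_s


# Roughly 40 ppm apart and 100 s apart: inside both default windows.
close_match = within_tolerance(500.02, 1300.0, 500.00, 1200.0)
# Roughly 60 ppm apart: outside the default 50 ppm window.
far_match = within_tolerance(500.03, 1300.0, 500.00, 1200.0)
```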
b |
diff -r 000000000000 -r d50f079096ee static/images/msfilt_csv_out.png |
b |
Binary file static/images/msfilt_csv_out.png has changed |
b |
diff -r 000000000000 -r d50f079096ee static/images/napq_overview.png |
b |
Binary file static/images/napq_overview.png has changed |
b |
diff -r 000000000000 -r d50f079096ee static/images/quantifere_cyto_out.png |
b |
Binary file static/images/quantifere_cyto_out.png has changed |