comparison purityA.xml @ 0:56cce1a90b73 draft

"planemo upload for repository https://github.com/computational-metabolomics/mspurity-galaxy commit cb903cd93f9378cfb5eeb68512a54178dcea7bbc-dirty"
author computational-metabolomics
date Wed, 27 Nov 2019 14:26:04 -0500
parents
children 0d73912c7cdc
-1:000000000000 0:56cce1a90b73
1 <tool id="mspurity_puritya" name="msPurity.purityA" version="@TOOL_VERSION@+galaxy@GALAXY_TOOL_VERSION@">
2 <description>
3 Assess acquired precursor ion purity of MS/MS spectra
4 </description>
5 <macros>
6 <import>macros.xml</import>
7 </macros>
8 <expand macro="requirements" />
9 <command detect_errors="exit_code"><![CDATA[
10 Rscript '$__tool_directory__/purityA.R'
11 --out_dir='.'
12
13 --mzML_files='
14 #for $i in $source
15 $i,
16 #end for
17 '
18 --galaxy_names='
19 #for $i in $source
20 $i.name,
21 #end for
22 '
23
24 #if $offsets.offsets == 'user'
25 --minOffset=$minoffset
26 --maxOffset=$maxoffset
27 #end if
28
29 --iwNorm=$iw_norm
30 --ilim=$ilim
31
32 --cores=\${GALAXY_SLOTS:-4}
33
34 #if $nearest
35 --nearest
36 #end if
37
38 #if $mostIntense
39 --mostIntense
40 #end if
41
42 #if $isotopes.isotopes == "exclude_default"
43 --exclude_isotopes
44 #elif $isotopes.isotopes == "user"
45 --exclude_isotopes
46 --isotope_matrix='$isotopes.im'
47 #end if
48
49 --ppmInterp $ppmInterp
50
51 ]]></command>
52 <inputs>
53 <param name="source" type="data" multiple="true" format="mzml" label="*.mzML file" >
54 <validator type="empty_field" />
55 </param>
56 <param argument="--mostIntense" type="boolean" checked="true"
57 label="Use most intense peak within isolation window for precursor?"
58 help="If yes, this will ignore the recorded precursor within the mzML file and
59 use the most intense peak within the isolation window to calculate the precursor ion purity"/>
60
61 <param argument="--nearest" type="boolean" checked="true"
62 label="Use nearest full scan to determine precursor?"
63 help="If yes, this will use the nearest full scan to the fragmentation scan to determine the m/z value
64 of the precursor"/>
65
66 <param argument="--ppmInterp" type="float" label="Interpolation PPM" min="0" value="7"
67 help="Set the ppm tolerance for the precursor ion purity interpolation,
68 i.e. the ppm tolerance between the precursor ions found in the neighbouring scans. The closest match
69 within the window will be used for the interpolation"/>
70
71 <conditional name="offsets">
72 <param name="offsets" type="select" label="Isolation window offsets" >
73 <option value="auto" selected="true">Use offsets recorded in the mzML file</option>
74 <option value="user">User supplied offset values</option>
75 </param>
76 <when value="user">
77 <expand macro="offsets" />
78 </when>
79 <when value="auto"/>
80
81 </conditional>
82
83 <expand macro="general_params" />
84
85 </inputs>
86 <outputs>
87 <data name="purityA_output_tsv" format="tsv" label="${tool.name} on ${on_string}: tsv"
88 from_work_dir="purityA_output.tsv" />
89 <data name="purityA_output_rdata" format="rdata" label="${tool.name} on ${on_string}: RData"
90 from_work_dir="purityA_output.RData" />
91 </outputs>
92 <tests>
93 <test>
94 <param name="source" value="LCMSMS_2.mzML,LCMSMS_1.mzML,LCMS_2.mzML,LCMS_1.mzML" ftype="mzml" />
95 <output name="purityA_output_tsv" value="purityA_output.tsv" />
96 <output name="purityA_output_rdata" value="purityA_output.RData" ftype="rdata" compare="sim_size"/>
97 </test>
98 </tests>
99
100 <help><![CDATA[
101 =============================================================
102 Assess precursor ion purity of MS/MS files
103 =============================================================
104 -----------
105 Description
106 -----------
107
108 **General**
109
110 Tool based on the msPurity::purityA R function, used to calculate the precursor ion purity of each MS/MS scan in mzML files.
111
112 Data input of mzML files either from:
113
114 * A data collection of the mzML files containing MS/MS scans
115 * A path to a folder that has mzML files containing MS/MS scans
116
117 The precursor ion purity is a measure of the contribution of a selected precursor
118 peak to an isolation window used for fragmentation and can be used as a way of assessing the
119 spectral quality and level of "contamination" of fragmentation spectra.
120
121 The calculation involves dividing the intensity of the selected precursor peak by the total
122 intensity of the isolation window and is performed before and after the MS/MS scan of
123 interest and interpolated at the recorded time of the MS/MS acquisition.
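
As a rough illustration of the calculation described above, the following Python sketch computes the purity in the MS1 scans before and after an MS/MS scan and interpolates it to the MS/MS acquisition time. It is not the msPurity implementation: the function names, the (m/z, intensity) input format, the positive offset convention and the use of simple linear interpolation are all assumptions made for the example. Isotope removal, the intensity limit and isolation efficiency normalisation are left out.

```python
def window_purity(peaks, precursor_mz, min_offset, max_offset, ppm=7.0):
    """Purity of the precursor within one MS1 scan.

    peaks: list of (mz, intensity) tuples (hypothetical input format).
    min_offset / max_offset: positive window half-widths in Da (assumed convention).
    """
    lo = precursor_mz - min_offset
    hi = precursor_mz + max_offset
    in_window = [(mz, inten) for mz, inten in peaks if lo <= mz <= hi]
    total = sum(inten for _, inten in in_window)
    if total == 0:
        return 0.0
    # Pick the peak closest to the precursor m/z, within the ppm tolerance.
    tol = precursor_mz * ppm / 1e6
    matches = [
        (abs(mz - precursor_mz), inten)
        for mz, inten in in_window
        if abs(mz - precursor_mz) <= tol
    ]
    if not matches:
        return 0.0
    _, precursor_intensity = min(matches)
    return precursor_intensity / total


def interpolate_purity(rt_before, purity_before, rt_after, purity_after, rt_msms):
    """Linear interpolation of purity at the MS/MS retention time (assumed scheme)."""
    if rt_after == rt_before:
        return purity_before
    frac = (rt_msms - rt_before) / (rt_after - rt_before)
    return purity_before + frac * (purity_after - purity_before)


# Invented example data: two MS1 scans bracketing one MS/MS scan.
scan_before = [(100.0, 80.0), (100.3, 20.0), (101.5, 500.0)]
scan_after = [(100.0, 60.0), (100.3, 40.0)]
p_before = window_purity(scan_before, 100.0, 0.5, 0.5)
p_after = window_purity(scan_after, 100.0, 0.5, 0.5)
print(round(interpolate_purity(10.0, p_before, 12.0, p_after, 11.0), 3))
```

In this invented example the contaminating peak at m/z 100.3 grows between the two MS1 scans, so the interpolated value falls between the two per-scan purities.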
124
125 Additionally, isotopic peaks are annotated and omitted from the calculation,
126 low-abundance peaks that are thought to contribute little to the
127 resulting MS/MS spectra are removed, and the isolation efficiency of the mass spectrometer can be
128 used to normalise the intensities used for the calculation.
129
130 The output is an RData file with the purityA S4 class object (referred to as pa for convenience throughout
131 the manual). The object contains a slot (pa@puritydf) where the details of the purity
132 assessments for each MS/MS scan are stored. The purityA object can then be used for further processing
133 including linking the fragmentation spectra to XCMS features, averaging fragmentation,
134 database creation and spectral matching (from the created database).
135
136 The pa@puritydf data frame is also output as a tsv file.
137
138 **Example LC-MS/MS processing workflow**
139
140 The purityA object can be used for further processing including linking the fragmentation spectra to XCMS features,
141 averaging fragmentation, database creation and spectral matching (from the created database). See below for an example workflow:
142
143 * Purity assessments
144 + (mzML files) -> **purityA** -> (pa)
145 * XCMS processing
146 + (mzML files) -> xcms.xcmsSet -> xcms.merge -> xcms.group -> xcms.retcor -> xcms.group -> (xset)
147 * Fragmentation processing
148 + (xset, pa) -> frag4feature -> filterFragSpectra -> averageAllFragSpectra -> createDatabase -> spectralMatching -> (sqlite spectral database)
149
150 **Isolation efficiency**
151
152 When the isolation efficiency of an MS instrument is known, the peak intensities within an isolation window can be normalised for the precursor purity calculation. The isolation efficiency can be estimated by measuring a single precursor across a sliding window (see figure 3 of the original msPurity paper, Lawson et al. 2017). This has been experimentally measured for a Thermo Fisher Q-Exactive mass spectrometer using 0.5 Da windows and can be set within msPurity by using msPurity::iwNormQE.5() as the input to the iwNormFunc argument.
153
154 Other options to model the isolation efficiency are a Gaussian isolation window, msPurity::iwNormGauss(minOff=-0.5, maxOff=0.5), or an R-Cosine window, msPurity::iwNormRCosine(minOff=-0.5, maxOff=0.5). The minOff and maxOff values can be altered depending on the isolation window size.
155
156 A user can also define their own normalisation function. The only requirement is that, given a value between minOff and maxOff, the function returns a normalisation value between 0 and 1.
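
The contract for a user-defined normalisation function can be sketched as follows. Within msPurity such a function would be written in R; Python is used here only to illustrate the requirement. The name make_gauss_norm, the sd_fraction parameter and the Gaussian shape are assumptions for the example and are not part of msPurity.

```python
import math


def make_gauss_norm(min_off=-0.5, max_off=0.5, sd_fraction=0.25):
    """Return a function mapping an m/z offset to a weight between 0 and 1.

    sd_fraction is a hypothetical parameter: the standard deviation of the
    Gaussian expressed as a fraction of the window width.
    """
    centre = (min_off + max_off) / 2.0
    sd = (max_off - min_off) * sd_fraction

    def norm(offset):
        # Offsets outside the isolation window contribute nothing.
        if offset < min_off or offset > max_off:
            return 0.0
        # Unscaled Gaussian: 1 at the window centre, smaller towards the edges.
        return math.exp(-((offset - centre) ** 2) / (2.0 * sd ** 2))

    return norm


norm = make_gauss_norm(-0.5, 0.5)
for offset in (-0.5, -0.25, 0.0, 0.25, 0.5):
    print(offset, round(norm(offset), 3))
```

Any shape would satisfy the requirement as long as every value returned for an offset between minOff and maxOff lies between 0 and 1.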
157
158 **Notes regarding instrument specific isolation window offsets used:**
159
160 * The isolation window offsets will be determined automatically by extracting metadata from the mzML file. For some vendors, however, this is not recorded; in these cases the offsets should be given by the user as an argument (offsets).
161 * In the case of Agilent only the "narrow" isolation is supported. This roughly equates to +/- 0.65 Da (depending on the instrument). If the file is detected as originating from an Agilent instrument the isolation widths will automatically be set as +/- 0.65 Da.
162
163
164 See Bioconductor documentation for more details about the function used, msPurity::purityA().
165
166 -----------
167 Outputs
168 -----------
169
170 * purityA_output_tsv: A tsv file of the precursor ion purity scores (and other metrics) for each fragmentation spectrum
171 * purityA_output_rdata: The purityA object saved as an RData file
172
173 The tsv file consists of the following columns:
174
175 * pid: unique id for MS/MS scan
176 * fileid: unique id for mzML file
177 * seqNum: scan number
178 * precursorIntensity: precursor intensity value as defined in the mzML file
179 * precursorMZ: precursor m/z value as defined in the mzML file
180 * precursorRT: precursor RT value as defined in the mzML file
181 * precursorScanNum: precursor scan number value as defined in mzML file
182 * id: unique id (redundant)
183 * filename: mzML filename
184 * precursorNearest: MS1 scan nearest to the MS/MS scan
185 * aMz: The m/z value in the "precursorNearest" MS1 scan which most closely matches the precursorMZ value provided from the mzML file
186 * aPurity: The purity score for aMz
187 * apkNm: The number of peaks in the isolation window for aMz
188 * iMz: The m/z value in the precursorNearest MS1 scan that is the most intense within the isolation window.
189 * iPurity: The purity score for iMz
190 * ipkNm: The number of peaks in the isolation window for iMz
191 * inPurity: The interpolated purity score (the purity score is calculated at neighbouring MS1 scans and interpolated at the point of the MS/MS acquisition)
192 * inpkNm: The interpolated number of peaks in the isolation window
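
As a hedged example of working with the tsv output downstream, the Python sketch below reads a table with the columns listed above and keeps the scans whose interpolated purity meets a threshold. The sample rows, the 0.5 threshold and the function names are invented for illustration. It assumes that the file has a header row with these column names and that missing values are written as NA.

```python
import csv
import io

# Invented sample with a subset of the columns; a real file has more columns.
SAMPLE_TSV = (
    "pid\tfileid\tseqNum\tprecursorMZ\tinPurity\n"
    "1\t1\t10\t150.05\t0.92\n"
    "2\t1\t14\t200.10\tNA\n"
    "3\t2\t11\t150.05\t0.31\n"
)


def read_purity_table(handle):
    """Read the tab-separated purity table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_by_purity(rows, column="inPurity", threshold=0.5):
    """Keep rows whose purity score in `column` is at least `threshold`."""
    kept = []
    for row in rows:
        try:
            score = float(row.get(column, ""))
        except (TypeError, ValueError):
            # Missing or non-numeric values (for example NA) are skipped.
            continue
        if score >= threshold:
            kept.append(row)
    return kept


rows = read_purity_table(io.StringIO(SAMPLE_TSV))
for row in filter_by_purity(rows, "inPurity", 0.5):
    print(row["pid"], row["precursorMZ"], row["inPurity"])
```

To use a downloaded dataset instead of the inline sample, pass an open file handle to read_purity_table; the aPurity or iPurity column can be filtered in the same way.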
193
194
195 ]]></help>
196 <expand macro="citations" />
197 </tool>