comparison msstats.xml @ 3:8212e342e482 draft

"planemo upload for repository https://github.com/galaxyproteomics/tools-galaxyp/tree/master/tools/msstats commit ad490a2f231f5ee1b6db160c117181e693ea1079"
author galaxyp
date Thu, 28 Jan 2021 20:48:40 +0000
parents 52ac6fde9a5b
children 593839e1f2c3
comparison
2:52ac6fde9a5b 3:8212e342e482
1 <tool id="msstats" name="MSstats" version="@VERSION@.0"> 1 <tool id="msstats" name="MSstats" version="@VERSION@.0">
2 <description>statistical relative protein significance analysis in DDA, SRM and DIA Mass Spectrometry</description> 2 <description>statistical relative protein significance analysis in DDA, SRM and DIA Mass Spectrometry</description>
3 <macros> 3 <macros>
4 <token name="@VERSION@">3.20.1</token> 4 <token name="@VERSION@">3.22.0</token>
5 <xml name="useUniquePeptide"> 5 <xml name="useUniquePeptide">
6 <param name="useUniquePeptide" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="remove peptides that are assigned for more than one proteins" help="We assume to use unique peptide for each protein"/> 6 <param name="useUniquePeptide" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove peptides that are assigned for more than one proteins"/>
7 </xml> 7 </xml>
8 <xml name="summaryforMultipleRows"> 8 <xml name="summaryforMultipleRows">
9 <param name="summaryforMultipleRows" type="select" label="Summary for MultipleRows" help="summaryforMultipleRows - when there are multiple measurements for certain feature and certain run, use highest or sum of all"> 9 <param name="summaryforMultipleRows" type="select" label="Summary for MultipleRows" help="When there are multiple measurements for certain feature and certain run, use highest or sum of all">
10 <option value="max" selected="true">max</option> 10 <option value="max" selected="true">max</option>
11 <option value="sum">sum</option> 11 <option value="sum">sum</option>
12 </param> 12 </param>
13 </xml> 13 </xml>
14 <xml name="fewMeasurements"> 14 <xml name="fewMeasurements">
15 <param name="fewMeasurements" type="select" label="Remove the features that have 1 or 2 measurements across runs" help="(fewMeasurements)"> 15 <param name="fewMeasurements" type="select" label="Features with few measurements" help="Remove the features that have 1 or 2 measurements across runs, or keep all features (the latter could give an error in fitting the statistical model)">
16 <option value="remove" selected="true">remove</option> 16 <option value="remove" selected="true">remove</option>
17 <option value="keep">keep</option> 17 <option value="keep">keep</option>
18 </param> 18 </param>
19 </xml> 19 </xml>
20 <xml name="removeProtein_with1Peptide"> 20 <xml name="removeProtein_with1Peptide">
30 Rscript '$msstats_script' 30 Rscript '$msstats_script'
31 && cat msstats*.log > '$log' 31 && cat msstats*.log > '$log'
32 ]]></command> 32 ]]></command>
33 <configfiles> 33 <configfiles>
34 <configfile name="msstats_script"><![CDATA[ 34 <configfile name="msstats_script"><![CDATA[
35
35 library('MSstats', warn.conflicts = F, quietly = T, verbose = F) 36 library('MSstats', warn.conflicts = F, quietly = T, verbose = F)
36 37
37 #if $input.input_src == 'MSstats' 38 #if $input.input_src == 'MSstats'
38 39
39 #if $input.msstats_input.is_of_type('csv') 40 #if $input.msstats_input.is_of_type('csv')
49 mq_proteinGroups <- read.table("$input.proteinGroups", sep="\t", header=TRUE) 50 mq_proteinGroups <- read.table("$input.proteinGroups", sep="\t", header=TRUE)
50 51
51 \# Read in annotation including condition and biological replicates per run. 52 \# Read in annotation including condition and biological replicates per run.
52 \# Users should make this annotation file. It is not the output from MaxQuant. 53 \# Users should make this annotation file. It is not the output from MaxQuant.
53 #if $input.annotation.is_of_type('csv') 54 #if $input.annotation.is_of_type('csv')
54 annot <- read.csv("$input.annotation", header=TRUE) 55 annot <- read.csv("$input.annotation", header=TRUE)
55 #else 56 #else
56 annot <- read.table("$input.annotation", sep="\t", header=TRUE) 57 annot <- read.table("$input.annotation", sep="\t", header=TRUE)
57 #end if 58 #end if
58 59
59 raw <- MaxQtoMSstatsFormat(evidence=mq_evidence, 60 raw <- MaxQtoMSstatsFormat(evidence=mq_evidence,
60 proteinGroups=mq_proteinGroups, 61 proteinGroups=mq_proteinGroups,
61 annotation=annot, 62 annotation=annot,
62 proteinID="$input.proteinID", 63 proteinID="$input.proteinID",
63 useUniquePeptide=$input.input_options.useUniquePeptide, 64 useUniquePeptide=$input.input_options.useUniquePeptide,
64 summaryforMultipleRows=$input.input_options.summaryforMultipleRows, 65 summaryforMultipleRows=$input.input_options.summaryforMultipleRows,
67 removeOxidationMpeptides=$input.input_options.removeOxidationMpeptides, 68 removeOxidationMpeptides=$input.input_options.removeOxidationMpeptides,
68 removeProtein_with1Peptide=$input.input_options.removeProtein_with1Peptide) 69 removeProtein_with1Peptide=$input.input_options.removeProtein_with1Peptide)
69 70
70 #elif $input.input_src == 'OpenMS' 71 #elif $input.input_src == 'OpenMS'
71 72
72 #if $input.evidence.is_of_type('csv') 73 #if $input.openms_input.is_of_type('csv')
73 input <- read.csv("$input.evidence", header=TRUE) 74 input <- read.csv("$input.openms_input", header=TRUE)
74 #else 75 #else
75 input <- read.table("$input.evidence", sep="\t", header=TRUE) 76 input <- read.table("$input.openms_input", sep="\t", header=TRUE)
76 #end if 77 #end if
77 #if $input.annotation.is_of_type('csv') 78
78 annot <- read.csv("$input.annotation", header=TRUE) 79 #if $input.annotation:
79 #else 80 #if $input.annotation.is_of_type('csv')
80 annot <- read.table("$input.annotation", sep="\t", header=TRUE) 81 annot <- read.csv("$input.annotation", header=TRUE)
81 #end if 82 #else
82 83 annot <- read.table("$input.annotation", sep="\t", header=TRUE)
83 raw <- OpenMStoMSstatsFormat(input, 84 #end if
85 #end if
86
87 raw <- OpenMStoMSstatsFormat(input,
88 #if $input.annotation:
84 annotation=annot, 89 annotation=annot,
90 #end if
85 useUniquePeptide=$input.input_options.useUniquePeptide, 91 useUniquePeptide=$input.input_options.useUniquePeptide,
86 summaryforMultipleRows=$input.input_options.summaryforMultipleRows, 92 summaryforMultipleRows=$input.input_options.summaryforMultipleRows,
87 fewMeasurements="$input.input_options.fewMeasurements", 93 fewMeasurements="$input.input_options.fewMeasurements",
88 removeProtein_with1Peptide=$input.input_options.removeProtein_with1Peptide) 94 removeProtein_with1Feature=$input.input_options.removeProtein_with1Feature)
95
89 96
90 #elif $input.input_src == 'OpenSWATH' 97 #elif $input.input_src == 'OpenSWATH'
91 98
92 #if $input.evidence.is_of_type('csv') 99 #if $input.openswath_input.is_of_type('csv')
93 input <- read.csv("$input.evidence", header=TRUE) 100 input <- read.csv("$input.openswath_input", header=TRUE)
94 #else 101 #else
95 input <- read.table("$input.evidence", sep="\t", header=TRUE) 102 input <- read.table("$input.openswath_input", sep="\t", header=TRUE)
96 #end if 103 #end if
97 #if $input.annotation.is_of_type('csv') 104 #if $input.annotation.is_of_type('csv')
98 annot <- read.csv("$input.annotation", header=TRUE) 105 annot <- read.csv("$input.annotation", header=TRUE)
99 #else 106 #else
100 annot <- read.table("$input.annotation", sep="\t", header=TRUE) 107 annot <- read.table("$input.annotation", sep="\t", header=TRUE)
101 #end if 108 #end if
102 109
103 raw <- OpenSWATHtoMSstatsFormat(input, 110 raw <- OpenSWATHtoMSstatsFormat(input,
104 annotation=annot, 111 annotation=annot,
105 filter_with_mscore=$input.input_options.filter_with_mscore, 112 filter_with_mscore=$input.input_options.filter_with_mscore,
107 useUniquePeptide=$input.input_options.useUniquePeptide, 114 useUniquePeptide=$input.input_options.useUniquePeptide,
108 fewMeasurements="$input.input_options.fewMeasurements", 115 fewMeasurements="$input.input_options.fewMeasurements",
109 removeProtein_with1Feature=$input.input_options.removeProtein_with1Feature, 116 removeProtein_with1Feature=$input.input_options.removeProtein_with1Feature,
110 summaryforMultipleRows=$input.input_options.summaryforMultipleRows) 117 summaryforMultipleRows=$input.input_options.summaryforMultipleRows)
111 118
119 #elif $input.input_src == 'Skyline'
120
121 #if $input.skyline_input.is_of_type('csv')
122 input <- read.csv("$input.skyline_input", header=TRUE)
123 #else
124 input <- read.table("$input.skyline_input", sep="\t", header=TRUE)
125 #end if
126
127 #if $input.annotation:
128 #if $input.annotation.is_of_type('csv')
129 annot <- read.csv("$input.annotation", header=TRUE)
130 #else
131 annot <- read.table("$input.annotation", sep="\t", header=TRUE)
132 #end if
133 #end if
134
135 raw <- SkylinetoMSstatsFormat(input,
136 #if $input.annotation:
137 annotation = annot,
138 #end if
139 removeiRT = $input.input_options.removeiRT,
140 filter_with_Qvalue = $input.input_options.filter_with_Qvalue,
141 qvalue_cutoff = $input.input_options.qvalue_cutoff,
142 useUniquePeptide = $input.input_options.useUniquePeptide,
143 fewMeasurements="$input.input_options.fewMeasurements",
144 removeOxidationMpeptides = $input.input_options.removeOxidationMpeptides,
145 removeProtein_with1Feature = $input.input_options.removeProtein_with1Feature)
146
112 #end if 147 #end if
113 148
114 processed_data <- dataProcess(raw, 149 processed_data <- dataProcess(raw,
115 logTrans=$dp_options.logTrans, 150 logTrans=$dp_options.logTrans,
116 normalization="$dp_options.norm.normalization", 151 normalization="$dp_options.norm.normalization",
117 #if $dp_options.norm.normalization == 'globalStandards' 152 #if $dp_options.norm.normalization == 'globalStandards'
118 nameStandards=c($dp_options.norm.nameStandards), 153 nameStandards=c($dp_options.norm.nameStandards),
119 #end if 154 #end if
120 ## address=$dp_options.address,
121 fillIncompleteRows=$dp_options.fillIncompleteRows, 155 fillIncompleteRows=$dp_options.fillIncompleteRows,
122 featureSubset="$dp_options.features.featureSubset", 156 featureSubset="$dp_options.features.featureSubset",
123 #if $dp_options.features.featureSubset == 'topN' 157 #if $dp_options.features.featureSubset == 'topN'
124 n_top_feature=$dp_options.features.n_top_feature, 158 n_top_feature=$dp_options.features.n_top_feature,
125 #end if 159 #end if
138 censoredInt=NULL, 172 censoredInt=NULL,
139 #else 173 #else
140 censoredInt="$dp_options.censoredInt", 174 censoredInt="$dp_options.censoredInt",
141 #end if 175 #end if
142 cutoffCensored="$dp_options.cutoffCensored", 176 cutoffCensored="$dp_options.cutoffCensored",
143 maxQuantileforCensored=$dp_options.maxQuantileforCensored, 177 maxQuantileforCensored = $dp_options.maxQuantileforCensored)
144 clusters=NULL)
145 178
146 #if 'processed_data' in $selected_outputs 179 #if 'processed_data' in $selected_outputs
147 write.table(processed_data\$ProcessedData, "ProcessedData.tsv", sep = "\t", quote = F, row.names = F, dec = ".") 180 write.table(processed_data\$ProcessedData, "ProcessedData.tsv", sep = "\t", quote = F, row.names = F, dec = ".")
148 #end if 181 #end if
149 #if 'runlevel_data' in $selected_outputs 182 #if 'runlevel_data' in $selected_outputs
150 write.table(processed_data\$RunlevelData, "RunlevelData.tsv", sep = "\t", quote = F, row.names = F, dec = ".") 183 write.table(processed_data\$RunlevelData, "RunlevelData.tsv", sep = "\t", quote = F, row.names = F, dec = ".")
151 #end if 184 #end if
152 185
153 #if 'qcplot' in $selected_outputs 186 #for $plot_type in $selected_outputs
154 dataProcessPlots(data = processed_data, type="QCplot", ylimUp=35, 187 #if $plot_type[-4:] == "Plot"
155 width=5, height=5, address="MSStats_only_") 188
156 #end if 189 dataProcessPlots(data = processed_data,
157 190 type = '$plot_type',
158 #if 'profile_plot' in $selected_outputs 191 featureName = "$out_plots_opt.featureName",
159 dataProcessPlots(data = processed_data, type="ProfilePlot", ylimUp=35, featureName="NA", width=5, height=5, address="MSStats_only_") 192 #if $out_plots_opt.ylimUp:
160 #end if 193 ylimUp = $out_plots_opt.ylimUp,
161 194 #end if
162 #if 'condition_plot' in $selected_outputs 195 #if $out_plots_opt.ylimDown:
163 dataProcessPlots(data = processed_data, type="ConditionPlot", width=5, height=5, address="MSStats_only_") 196 ylimDown = $out_plots_opt.ylimDown,
164 #end if 197 #end if
198 scale = $out_plots_opt.scale,
199 interval = "$out_plots_opt.interval",
200 x.axis.size = $out_plots_opt.x_axis_size,
201 y.axis.size = $out_plots_opt.y_axis_size,
202 text.size = $out_plots_opt.text_size,
203 text.angle = $out_plots_opt.text_angle,
204 legend.size = $out_plots_opt.legend_size,
205 dot.size.profile = $out_plots_opt.dot_size_profile,
206 dot.size.condition = $out_plots_opt.dot_size_condition,
207 width = $out_plots_opt.width,
208 height = $out_plots_opt.height,
209 #if $out_plots_opt.which_Protein.select != 'list'
210 which.Protein = "$out_plots_opt.which_Protein.select",
211 #else
212 which.Protein = unlist(read.table("$out_plots_opt.which_Protein.protein_list", sep = "\n", header = FALSE), use.names = FALSE),
213 #end if
214 remove_uninformative_feature_outlier = $out_plots_opt.remove_uninformative_feature_outlier,
215 address="MSStats_only_")
216
217 #end if
218 #end for
165 219
166 ## Quantification 220 ## Quantification
167 #if 'quant_sample_matrix' in $selected_outputs 221 #if 'quant_sample_matrix' in $selected_outputs
168 sampleQuantMatrix <- quantification(processed_data, type="Sample") 222 sampleQuantMatrix <- quantification(processed_data, type="Sample")
169 write.table(sampleQuantMatrix, "SampleQuantificationMatrix.tsv", sep = "\t", quote = F, row.names = F, dec = ".") 223 write.table(sampleQuantMatrix, "SampleQuantificationMatrix.tsv", sep = "\t", quote = F, row.names = F, dec = ".")
216 write.table(comparisons\$ModelQC, "ModelQC.tsv", sep = "\t", quote = F, row.names = F, dec = ".") 270 write.table(comparisons\$ModelQC, "ModelQC.tsv", sep = "\t", quote = F, row.names = F, dec = ".")
217 #end if 271 #end if
218 272
219 ## Visualizations: 273 ## Visualizations:
220 274
221 #if 'qqplot' in $group.select_outputs 275 #for $plot_type in $group.select_outputs
222 \# normal quantile-quantile plots 276
223 modelBasedQCPlots(data=comparisons, type="QQPlots", 277 #if $plot_type == "QQPlots" or $plot_type == "ResidualPlots"
224 width=5, height=5, address="MSStats_group_") 278
225 #end if 279 modelBasedQCPlots(data = comparisons,
226 280 type = "$plot_type",
227 #if 'residualplot' in $group.select_outputs 281 axis.size = $comparison_plots_opt.axis_size,
228 \# residual plots 282 dot.size = $comparison_plots_opt.dot_size,
229 modelBasedQCPlots(data=comparisons, type="ResidualPlots", 283 text.size = $comparison_plots_opt.text_size,
230 width=5, height=5, address="MSStats_group_") 284 legend.size = $comparison_plots_opt.legend_size,
231 #end if 285 width = $comparison_plots_opt.width,
232 286 height = $comparison_plots_opt.height,
233 #if 'volcanoplot' in $group.select_outputs 287 #if $comparison_plots_opt.which_Protein.select != 'list'
234 \# volcano plot 288 which.Protein = "$comparison_plots_opt.which_Protein.select",
235 groupComparisonPlots(data = comparisons\$ComparisonResult, type = 'VolcanoPlot',ProteinName=FALSE, 289 #else
236 width=5, height=5, address="MSStats_group_") 290 which.Protein = unlist(read.table("$comparison_plots_opt.which_Protein.protein_list", sep = "\n", header = FALSE), use.names = FALSE),
237 #end if 291 #end if
238 292 address="MSStats_group_")
239 #if 'heatmap' in $group.select_outputs 293
240 \# heatmap - works only for more than 1 comparison 294
241 if (nrow(comparison)>1) 295 #elif $plot_type == "VolcanoPlot" or $plot_type == "Heatmap" or $plot_type == "ComparisonPlot"
242 { 296
243 groupComparisonPlots(data = comparisons\$ComparisonResult, type = 'Heatmap', address="MSStats_group_") 297 groupComparisonPlots(data = comparisons\$ComparisonResult,
244 } 298 type = "$plot_type",
245 #end if 299 sig = $comparison_plots_opt.sig,
246 300 #if $comparison_plots_opt.FCcutoff:
247 #if 'comparisonplot' in $group.select_outputs 301 FCcutoff = $comparison_plots_opt.FCcutoff,
248 \#comparison 302 #end if
249 groupComparisonPlots(data=comparisons\$ComparisonResult, type="ComparisonPlot", 303 logBase.pvalue = $comparison_plots_opt.logBase_pvalue,
250 width=5, height=5, address="MSStats_group_") 304 #if $comparison_plots_opt.ylimUp:
251 #end if 305 ylimUp = $comparison_plots_opt.ylimUp,
306 #end if
307 #if $comparison_plots_opt.ylimDown:
308 ylimDown = $comparison_plots_opt.ylimDown,
309 #end if
310 x.axis.size = $comparison_plots_opt.x_axis_size,
311 y.axis.size = $comparison_plots_opt.y_axis_size,
312 dot.size = $comparison_plots_opt.dot_size,
313 text.size = $comparison_plots_opt.text_size,
314 text.angle = $comparison_plots_opt.text_angle,
315 legend.size = $comparison_plots_opt.legend_size,
316 ProteinName = $comparison_plots_opt.ProteinName,
317 colorkey = $comparison_plots_opt.colorkey,
318 numProtein = $comparison_plots_opt.numProtein,
319 clustering = "$comparison_plots_opt.clustering",
320 width = $comparison_plots_opt.width,
321 height = $comparison_plots_opt.height,
322 #if $comparison_plots_opt.which_Protein.select != 'list'
323 which.Protein = "$comparison_plots_opt.which_Protein.select",
324 #else
325 which.Protein = unlist(read.table("$comparison_plots_opt.which_Protein.protein_list", sep = "\n", header = FALSE), use.names = FALSE),
326 #end if
327 #if $comparison_plots_opt.which_Comparison.select != 'list'
328 which.Comparison = "$comparison_plots_opt.which_Comparison.select",
329 #else
330 which.Comparison = unlist(read.table("$comparison_plots_opt.which_Comparison.comparison_list", sep = "\n", header = FALSE), use.names = FALSE),
331 #end if
332 address="MSStats_group_")
333
334
335 #end if
336 #end for
252 337
253 #end if 338 #end if
254 ]]></configfile> 339 ]]></configfile>
255 </configfiles> 340 </configfiles>
256 <inputs> 341 <inputs>
257 <conditional name="input"> 342 <conditional name="input">
258 <param name="input_src" type="select" label="input source"> 343 <param name="input_src" type="select" label="input source">
259 <option value="MSstats">MSstats 10-column format</option> 344 <option value="MSstats">MSstats 10-column format</option>
260 <option value="MaxQuant">MaxQuant</option> 345 <option value="MaxQuant">MaxQuant</option>
261 <!--
262 <option value="OpenMS">OpenMS</option> 346 <option value="OpenMS">OpenMS</option>
263 -->
264 <option value="OpenSWATH">OpenSWATH</option> 347 <option value="OpenSWATH">OpenSWATH</option>
348 <!--option value="DIAUmpire">DIA-Umpire</option-->
349 <option value="Skyline">Skyline</option>
265 </param> 350 </param>
266 <when value="MSstats"> 351 <when value="MSstats">
267 <param name="msstats_input" type="data" format="tabular,csv" label="MSstats 10-column input"/> 352 <param name="msstats_input" type="data" format="tabular,csv" label="MSstats 10-column input"/>
268 </when> 353 </when>
269 <when value="MaxQuant"> 354 <when value="MaxQuant">
270 <param name="evidence" type="data" format="tabular,csv" label="evidence.txt - feature-level data"/> 355 <param name="evidence" type="data" format="tabular,csv" label="evidence.txt - feature-level data"/>
271 <param name="annotation" type="data" format="tabular,csv" label="annotation.txt data which includes Raw.file, Condition, BioReplicate, Run, IsotopeLabelType information"/> 356 <param name="proteinGroups" type="data" format="tabular,csv" optional="True" label="proteinGroups.txt - protein-level data" help="It needs to match protein group ID. If not selected use Proteins in 'evidence.txt'"/>
272 <param name="proteinGroups" type="data" format="tabular,csv" label="proteinGroups.txt" help="It needs to matching protein group ID. If proteinGroups=NULL, use 'Proteins' column in 'evidence.txt'"/> 357 <param name="annotation" type="data" format="tabular,csv" label="annotation file" help="Columns: Raw.file, Condition (the name of the condition is not allowed to start with a number or contain any special characters.), BioReplicate, Run, IsotopeLabelType information"/>
358
273 <param name="proteinID" type="select" label="Select Protein ID in evidence.txt"> 359 <param name="proteinID" type="select" label="Select Protein ID in evidence.txt">
274 <option value="Proteins">Protein column</option> 360 <option value="Proteins">Protein column</option>
275 <option value="Leading.razor.protein">Leading razor protein column</option> 361 <option value="Leading.razor.protein">Leading razor protein column</option>
276 </param> 362 </param>
277 <section name="input_options" title="MaxQtoMSstatsFormat Options" expanded="false"> 363 <section name="input_options" title="MaxQtoMSstatsFormat Options" expanded="false">
278 <expand macro="useUniquePeptide"/> 364 <expand macro="useUniquePeptide"/>
279 <expand macro="summaryforMultipleRows"/> 365 <expand macro="summaryforMultipleRows"/>
280 <expand macro="fewMeasurements"/> 366 <expand macro="fewMeasurements"/>
281 <param name="removeMpeptides" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove the peptides including 'M' sequence"/> 367 <param name="removeMpeptides" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove the peptides including 'M' sequence"/>
282 <param name="removeOxidationMpeptides" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove the peptides including Oxidized 'M' sequence"/> 368 <param name="removeOxidationMpeptides" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove the peptides including Oxidized 'M' sequence"/>
283 <expand macro="removeProtein_with1Peptide"/> 369 <expand macro="removeProtein_with1Peptide"/>
284 </section> 370 </section>
285 </when> 371 </when>
286 <!--
287 <when value="OpenMS"> 372 <when value="OpenMS">
288 <param name="evidence" type="data" format="tabular,csv" label="OpenSWATH_input"/> 373 <param name="openms_input" type="data" format="tabular,csv" label="OpenMS input (e.g. output of MSstatsConverter)"/>
289 <param name="annotation" type="data" format="tabular,csv" label="OpenSWATH_annotation"/> 374 <param name="annotation" type="data" format="tabular,csv" optional="true" label="If annotation is not yet complete in OpenMS, use annotation with Raw.file, Condition (the name of the condition is not allowed to start with a number or contain any special characters), BioReplicate, and Run information"/>
290 <section name="input_options" title="MaxQtoMSstatsFormat Options" expanded="false"> 375 <section name="input_options" title="OpenMStoMSstatsFormat Options" expanded="false">
291 <expand macro="useUniquePeptide"/> 376 <expand macro="useUniquePeptide"/>
292 <expand macro="summaryforMultipleRows"/> 377 <expand macro="summaryforMultipleRows"/>
293 <expand macro="fewMeasurements"/> 378 <expand macro="fewMeasurements"/>
294 <expand macro="removeProtein_with1Peptide"/> 379 <param name="removeProtein_with1Feature" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove the proteins which have only 1 peptide and charge"/>
295 </section> 380 </section>
296 </when> 381 </when>
297 -->
298 <when value="OpenSWATH"> 382 <when value="OpenSWATH">
299 <param name="evidence" type="data" format="tabular,csv" label="OpenSWATH_input"/> 383 <param name="openswath_input" type="data" format="tabular,csv" label="OpenSWATH_input"/>
300 <param name="annotation" type="data" format="tabular,csv" label="OpenSWATH_annotation"/> 384 <param name="annotation" type="data" format="tabular,csv" label="annotation file"/>
301 <section name="input_options" title="OpenSWATHtoMSstatsFormat Options" expanded="false"> 385 <section name="input_options" title="OpenSWATHtoMSstatsFormat Options" expanded="false">
302 <param name="filter_with_mscore" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove the peptides including 'M' sequence"/> 386 <param name="filter_with_mscore" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove the peptides including 'M' sequence"/>
303 <param name="mscore_cutoff" type="float" value="0.01" min="0" max="1.0" label="mscore_cutoff"/> 387 <param name="mscore_cutoff" type="float" value="0.01" min="0" max="1.0" label="m_score cutoff"/>
304 <expand macro="useUniquePeptide"/> 388 <expand macro="useUniquePeptide"/>
305 <expand macro="fewMeasurements"/> 389 <expand macro="fewMeasurements"/>
306 <expand macro="summaryforMultipleRows"/> 390 <expand macro="summaryforMultipleRows"/>
307 <param name="removeProtein_with1Feature" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove the proteins which have only 1 peptide and charge"/> 391 <param name="removeProtein_with1Feature" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove the proteins which have only 1 peptide and charge"/>
308 </section> 392 </section>
309 </when> 393 </when>
310 </conditional> 394 <when value="Skyline">
395 <param name="skyline_input" type="data" format="tabular,csv" label="Skyline input"/>
396 <param name="annotation" type="data" optional="true" format="tabular,csv" label="annotation file"/>
397 <section name="input_options" title="SkylinetoMSstatsFormat Options" expanded="false">
398 <param name="removeiRT" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Remove iRT" help="Yes (default) will remove the proteins or peptides which are labeled ’iRT’ in ’StandardType’ column. No will keep them."/>
399 <param name="filter_with_Qvalue" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Filter with Qvalue" help="Yes (default) will filter out the intensities that have greater than qvalue_cutoff in Detection QValue column. Those intensities will be replaced with zero and will be considered as censored missing values for imputation purpose."/>
400 <param name="qvalue_cutoff" type="float" value="0.01" min="0" max="1.0" label="Cutoff for Detection QValue."/>
401 <expand macro="removeProtein_with1Peptide"/>
402 <expand macro="useUniquePeptide"/>
403 <expand macro="fewMeasurements"/>
404 <param name="removeOxidationMpeptides" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove Oxidation M peptides" help="Yes will remove the peptides including ’oxidation (M)’ in modification."/>
405 <param name="removeProtein_with1Feature" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove proteins with 1 feature" help="Yes will remove the proteins which have only 1 peptide and charge."/>
406 </section>
407 </when>
408 </conditional>
409
311 <section name="dp_options" title="dataProcess Options" expanded="false"> 410 <section name="dp_options" title="dataProcess Options" expanded="false">
312 <param name="logTrans" type="select" label="Log-transform Variable ABUNDANCE with base:" help="(logTrans)"> 411 <param name="logTrans" type="select" label="logarithm transformation of intensities with base 2 or 10." help="Intensities for original intensity between 0 and 1 will be replaced with zero value after normalization.">
313 <option value="2" selected="true">2</option> 412 <option value="2" selected="true">2</option>
314 <option value="10">10</option> 413 <option value="10">10</option>
315 </param> 414 </param>
316 <conditional name="norm"> 415 <conditional name="norm">
317 <param name="normalization" type="select" label="Normalization to remove systematic bias between MS runs"> 416 <param name="normalization" type="select" label="Normalization to remove systematic bias between MS runs">
318 <option value="equalizeMedians" selected="true">equalizeMedians - represents constant normalization</option> 417 <option value="equalizeMedians" selected="true">equalizeMedians - represents constant normalization</option>
319 <option value="quantile">quantile - quantile normalization</option> 418 <option value="quantile">quantile - quantile normalization</option>
320 <option value="globalStandards">globalStandards - normalization with global standards proteins</option> 419 <option value="globalStandards">globalStandards - normalization with global standards proteins</option>
321 <option value="FALSE">no normalization is performed</option> 420 <option value="FALSE">false - no normalization is performed</option>
322 </param> 421 </param>
323 <when value="equalizeMedians"/> 422 <when value="equalizeMedians"/>
324 <when value="quantile"/> 423 <when value="quantile"/>
325 <when value="globalStandards"> 424 <when value="globalStandards">
326 <param name="nameStandards" type="text" value="" label="global standard peptide names"> 425 <param name="nameStandards" type="text" value="" label="global standard peptide names" help="Peptide names should be double-quoted and separated by commas">
327 <help>peptide names should be double-quoted and separated by commas</help>
328 <validator type="empty_field" /> 426 <validator type="empty_field" />
329 <validator type="regex" message="double-quoted names separated by commas"><![CDATA[^".+"(,".+")*$]]></validator> 427 <validator type="regex" message="double-quoted names separated by commas"><![CDATA[^".+"(,".+")*$]]></validator>
330 </param> 428 </param>
331 </when> 429 </when>
332 <when value="FALSE"/> 430 <when value="FALSE"/>
333 </conditional> 431 </conditional>
334 <param name="fillIncompleteRows" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Fill Incomplete Rows" help=" If the input dataset has incomplete rows, TRUE (default) adds the rows with intensity value=NA for missing peaks. FALSE reports error message with list of features which have incomplete rows"/> 432 <param name="fillIncompleteRows" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Fill Incomplete Rows" help="If the input dataset has incomplete rows, 'Yes' (default) adds the rows with intensity value=NA for missing peaks. 'No' reports error message with list of features which have incomplete rows"/>
335 <conditional name="features"> 433 <conditional name="features">
336 <param name="featureSubset" type="select" label="Features to use"> 434 <param name="featureSubset" type="select" label="Feature Subset">
337 <option value="all" selected="true">Use all features that the data set has</option> 435 <option value="all" selected="true">Use all features that the data set has</option>
338 <option value="top3">Use the top 3 features which have highest average of log2(intensity) across runs</option> 436 <option value="top3">Use the top 3 features which have highest average of log2(intensity) across runs</option>
339 <option value="topN">Use the top N features which have highest average of log2(intensity) across runs</option> 437 <option value="topN">Use the top N features which have highest average of log2(intensity) across runs</option>
340 <option value="highQuality">Flag uninformative feature and outliers</option> 438 <option value="highQuality">High quality: Flag uninformative feature and outliers</option>
341 </param> 439 </param>
342 <when value="all"/> 440 <when value="all"/>
343 <when value="top3"/> 441 <when value="top3"/>
344 <when value="topN"> 442 <when value="topN">
345 <param name="n_top_feature" type="integer" value="3" min="1" label="The number of top features for featureSubset"/> 443 <param name="n_top_feature" type="integer" value="3" min="1" label="The number of top features for Feature Subset"/>
346 </when> 444 </when>
347 <when value="highQuality"> 445 <when value="highQuality">
348 <param name="remove_uninformative_feature_outlier" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove features flagged with Uninformative feature_quality"/> 446 <param name="remove_uninformative_feature_outlier" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove features flagged with uninformative feature quality"/>
349 </when> 447 </when>
350 </conditional> 448 </conditional>
351 <conditional name="summarize"> 449 <conditional name="summarize">
352 <param name="summaryMethod" type="select" label="Summary Method"> 450 <param name="summaryMethod" type="select" label="Summary Method">
353 <option value="TMP" selected="true">TMP - Tukey's median polish</option> 451 <option value="TMP" selected="true">TMP - Tukey's median polish</option>
354 <option value="linear" selected="true">linear - linear mixed model</option> 452 <option value="linear" selected="true">linear - linear mixed model</option>
355 </param> 453 </param>
356 <when value="TMP"> 454 <when value="TMP">
357 <param name="MBimpute" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Impute Missing Values 'NA' or '0' (depending on censoredInt option) by Accelated failure model" help="(MBimpute) TRUE - inserts 'NA' or '0' (depending on censoredInt option), . FALSE uses the values assigned by cutoffCensored"/> 455 <param name="MBimpute" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Impute Missing Values" help="Yes: inserts 'NA' or '0' (depending on censored intensity), No: uses the values assigned by cutoff value for censoring"/>
358 <param name="remove50missing" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove runs which have more than 50% missing values"/> 456 <param name="remove50missing" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove runs which have more than 50% missing values"/>
359 </when> 457 </when>
360 <when value="linear"> 458 <when value="linear">
361 <param name="equalFeatureVar" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Account for heterogeneous variation among intensities from different features" help="(equalFeatureVar) TRUE assumes equal variance among intensities from features. FALSE means that we cannot assume equal variance among intensities from features, then we will account for heterogeneous variation from different features"/> 459 <param name="equalFeatureVar" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Account for heterogeneous variation among intensities from different features" help="Yes: assumes equal variance among intensities from features. No: means that we cannot assume equal variance among intensities from features, then we will account for heterogeneous variation from different features"/>
362 </when> 460 </when>
363 </conditional> 461 </conditional>
364 <param name="censoredInt" type="select" label="Missing values to censor"> 462 <param name="censoredInt" type="select" label="Censored intensity">
365 <help>The output from Skyline and Progenesis should use '0'</help> 463 <help>The processing tools report missing values differently. This option is for distinguishing which value should be considered as missing, and further whether it is censored or at random. Skyline and OpenSWATH input should use '0'. MaxQuant input should use 'NA'</help>
366 <option value="NA" selected="true">Assume that all 'NA's in 'Intensity' column are censored</option> 464 <option value="NA" selected="true">NA - Assume that all 'NA's in 'Intensity' column are censored</option>
367 <option value="0">Use zero intensities '0' as censored intensity</option> 465 <option value="0">0 - Use zero intensities '0' as censored intensity</option>
368 <option value="NULL">Assume all NA intensites are randomly missing</option> 466 <option value="NULL">NULL - Assume all NA intensites are randomly missing</option>
369 </param> 467 </param>
370 <param name="cutoffCensored" type="select" label="Cutoff value for censoring"> 468 <param name="cutoffCensored" type="select" label="Cutoff value for censoring">
371 <option value="minFeature" selected="true">minimum value for each feature</option> 469 <option value="minFeature" selected="true">minimum value for each feature</option>
372 <option value="minRun">minimum value for each run</option> 470 <option value="minRun">minimum value for each run</option>
373 <option value="minFeatureNRun">smallest between minimum value of corresponding feature and minimum value of corresponding run</option> 471 <option value="minFeatureNRun">smallest between minimum value of corresponding feature and minimum value of corresponding run</option>
374 </param> 472 </param>
375 <param name="maxQuantileforCensored" type="float" value="0.999" min="0.75" max="1.0" label="Maximum quantile for deciding censored missing values"/> 473 <param name="maxQuantileforCensored" type="float" value="0.999" min="0" max="1.0" label="Maximum quantile for deciding censored missing values."/>
376 </section> 474 </section>
377 <param name="selected_outputs" type="select" multiple="true" optional="false" label="Select outputs"> 475 <param name="selected_outputs" type="select" multiple="true" optional="false" label="Select outputs">
378 <option value="log" selected="true">MSstats log</option> 476 <option value="log" selected="true">MSstats log</option>
379 <option value="r_script" selected="false">MSstats Rscript</option> 477 <option value="r_script" selected="false">MSstats Rscript</option>
380 <option value="processed_data" selected="true">MSstats ProcessedData</option> 478 <option value="processed_data" selected="true">MSstats ProcessedData</option>
381 <option value="runlevel_data" selected="false">MSstats RunlevelData</option> 479 <option value="runlevel_data" selected="false">MSstats RunlevelData</option>
382 <option value="qcplot" selected="true">MSstats QCPlot.pdf</option> 480 <option value="QCPlot" selected="true">MSstats QCPlot</option>
383 <option value="profile_plot" selected="false">MSstats ProfilePlot.pdf</option> 481 <option value="ProfilePlot" selected="false">MSstats ProfilePlot</option>
384 <option value="profile_wsum_plot" selected="false">MSstats ProfilePlot_wSummarization.pdf</option> 482 <option value="profile_wsum_plot" selected="false">MSstats ProfilePlot_wSummarization</option>
385 <option value="condition_plot" selected="false">MSstats ConditionPlot.pdf</option> 483 <option value="ConditionPlot" selected="false">MSstats ConditionPlot</option>
386 <option value="quant_sample_matrix" selected="false">Sample Quantification Matrix Table</option> 484 <option value="quant_sample_matrix" selected="false">Sample Quantification Matrix Table</option>
387 <option value="quant_sample_long" selected="false">Sample Quantification Long Table</option> 485 <option value="quant_sample_long" selected="false">Sample Quantification Long Table</option>
388 <option value="quant_group_matrix" selected="true">Group Quantification Matrix Table</option> 486 <option value="quant_group_matrix" selected="true">Group Quantification Matrix Table</option>
389 <option value="quant_group_long" selected="false">Group Quantification Long Table</option> 487 <option value="quant_group_long" selected="false">Group Quantification Long Table</option>
390 </param> 488 </param>
391 489 <section name="out_plots_opt" title="DataProcess Plot Options" expanded="false">
490 <param name="featureName" type="select" display="radio" label="Feature name for Profile Plot" help="Transition means printing feature legend in transition-level; Peptide means printing feature legend in peptide-level; NA means no feature legend printing.">
491 <option value="Transition" selected="true">Transition</option>
492 <option value="Peptide">Peptide</option>
493 <option value="NA">NA</option>
494 </param>
495 <param name="ylimUp" type="float" optional="true" label="For all three plots, upper limit for y-axis." help="Empty (default) for Profile Plot and QC Plot uses the upper limit as rounded off maximum of log2(intensities) after normalization + 3; for Condition Plot maximum of log ratio + SD or CI. Alternatively, insert specific value of y-axis limit."/>
496 <param name="ylimDown" type="float" optional="true" label="For all three plots, lower limit for y-axis in the log scale" help="Empty (default) for Profile Plot and QCPlot uses 0; for Condition Plot is minimum of log ratio - SD or CI. Alternatively, insert specific value of lower y-axis limit. "/>
497
498 <param name="scale" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Scale for Condition Plot" help=" No (Default) means each conditional level is not scaled at x-axis according to its actual value (equal space at x-axis). Yes means each conditional level is scaled at x-axis according to its actual value (unequal space at x-axis)."/>
499 <param name="interval" type="select" display="radio" label="Interval for Condition Plot" help="CI (default) uses confidence interval with 0.95 significant level for the width of error bar. SD uses standard deviation for the width of error bar.">
500 <option value="CI" selected="true">CI - confidence interval</option>
501 <option value="SD">SD - standard deviation</option>
502 </param>
503 <param name="x_axis_size" type="integer" min="1" value="10" label="Size of x-axis labeling for 'Run' in Profile Plot and QC Plot, and 'Condition' in Condition Plot"/>
504 <param name="y_axis_size" type="integer" min="1" value="10" label="Size of y-axis labeling"/>
505 <param name="text_size" type="integer" min="1" value="4" label="Size of labeling for feature names in normal QQPlots separately for each feature and size of labels represented each condition at the top of graph in Profile Plot and QC plot."/>
506 <param name="text_angle" type="integer" min="0" max="360" value="90" label="Angle of labels represented each condition at the top of graph in Profile Plot and QC plot or x-axis labeling in Condition plot."/>
507 <param name="legend_size" type="integer" min="1" value="7" label="Size of feature names in residual plots and feature legend (transition-level or peptide-level) above graph in Profile Plot. "/>
508 <param name="dot_size_profile" type="integer" min="1" value="2" label="Size of dots in Profile plot"/>
509 <param name="dot_size_condition" type="integer" min="1" value="3" label="Size of dots in Condition plot"/>
510 <param name="width" type="integer" min="1" value="8" label="Width of the saved pdf file"/>
511 <param name="height" type="integer" min="1" value="5" label="Height of the saved pdf file"/>
512 <conditional name="which_Protein">
513 <param name="select" type="select" label="Select protein IDs to draw plots">
514 <option value="all" selected="true">generate all plots for each protein</option>
515 <option value="allonly">Option for QC plot: "allonly" will generate one QC plot with all proteins</option>
516 <option value="list">Protein IDs as tabular input</option>
517 </param>
518 <when value="all"/>
519 <when value="allonly"/>
520 <when value="list">
521 <param name="protein_list" type="data" format="tabular" label="List of proteins"/>
522 </when>
523 </conditional>
524 <param name="remove_uninformative_feature_outlier" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove uninformative feature outlier in profile plots" help="It only works after when feature subset high Quality was used in dataProcess options. Yes allows to remove 1) the features are flagged in the column, feature_quality=Uninformative which are features with bad quality, 2) outliers that are flagged in the column, is_outlier=TRUE in profile plots. No (default) shows all features and intensities in profile plots."/>
525 </section>
392 <conditional name="group"> 526 <conditional name="group">
393 <param name="group_comparison" type="select" label="Compare Groups"> 527 <param name="group_comparison" type="select" label="Compare Groups">
394 <option value="no">No</option> 528 <option value="no">No</option>
395 <option value="yes">Yes</option> 529 <option value="yes">Yes</option>
396 </param> 530 </param>
397 <when value="no"/> 531 <when value="no"/>
398 <when value="yes"> 532 <when value="yes">
399 <param name="comparison_matrix" type="data" format="tabular,csv" label="Comparison Matrix"/> 533 <param name="comparison_matrix" type="data" format="tabular,csv" label="Comparison Matrix"/>
400 <param name="select_outputs" type="select" multiple="true" label="Select outputs"> 534 <param name="select_outputs" type="select" multiple="true" label="Select outputs">
401 <help>Heatmap requires more than one comparison</help> 535 <help>Heatmap requires more than one comparison</help>
402 <option value="fittedmodel" selected="true">MSstats ComparisonFittedModel.txt</option> 536 <option value="fittedmodel" selected="false">MSstats ComparisonFittedModel.txt</option>
403 <option value="comparison_result" selected="true">MSstats ComparisonResult.tsv</option> 537 <option value="comparison_result" selected="true">MSstats ComparisonResult.tsv</option>
404 <option value="model_qc" selected="false">MSstats ModelQC.tsv</option> 538 <option value="model_qc" selected="false">MSstats ModelQC.tsv</option>
405 <option value="qqplot" selected="false">MSstats QQPlot.pdf</option> 539 <option value="QQPlots" selected="false">MSstats QQPlot</option>
406 <option value="residualplot" selected="false">MSstats ResidualPlot.pdf</option> 540 <option value="ResidualPlots" selected="false">MSstats ResidualPlot</option>
407 <option value="volcanoplot" selected="true">MSstats VolcanoPlot.pdf</option> 541 <option value="VolcanoPlot" selected="true">MSstats VolcanoPlot</option>
408 <option value="heatmap" selected="false">MSstats Heatmap.pdf</option> 542 <option value="Heatmap" selected="false">MSstats Heatmap</option>
409 <option value="comparisonplot" selected="true">MSstats ComparisonPlot.pdf</option> 543 <option value="ComparisonPlot" selected="true">MSstats ComparisonPlot</option>
410 </param> 544 </param>
411 </when> 545 </when>
412 </conditional> 546 </conditional>
547 <section name="comparison_plots_opt" title="Comparison Plot Options" expanded="false">
548 <param name="sig" type="float" min="0" max="1" value="0.05" label="FDR cutoff for the adjusted p-values in heatmap and volcano plot" help="Level of significance for comparison plot. 100(1-sig)% confidence interval will be drawn."/>
549 <param name="FCcutoff" type="float" optional="true" label="Involve fold change cutoff or not for volcano plot or heatmap." help="Empty (default) means no fold change cutoff is applied for significance analysis. Specific value means specific fold change cutoff is applied"/>
550 <param name="logBase_pvalue" type="select" label="For volcano plot or heatmap, logarithm transformation of adjusted p-value with base 2 or 10">
551 <option value="2">2</option>
552 <option value="10" selected="true">10</option>
553 </param>
554 <param name="ylimUp" type="float" optional="true" label="For all three plots, upper limit for y-axis." help="Empty (default) for volcano plot/heatmap use maximum of -log2 (adjusted p-value) or -log10 (adjusted p-value), for comparison plot uses maximum of log-fold change + CI. Alternatively, insert specific value of y-axis limit. "/>
555 <param name="ylimDown" type="float" optional="true" label="For all three plots, lower limit for y-axis in the log scale" help="Empty (default) for volcano plot/heatmap use minimum of -log2 (adjusted p-value) or -log10 (adjusted p-value), for comparison plot uses minimum of log-fold change - CI. Alternatively, insert specific value of y-axis limit. "/>
556 <param name="xlimUp" type="float" optional="true" label="For Volcano plot, the limit for x-axis" help="Empty (default) uses maximum for absolute value of log-fold change or 3 as default if maximum for absolute value of log-fold change is less than 3. Alternatively, insert specific value of x-axis limit."/>
557 <param name="axis_size" type="integer" min="1" value="10" label="Size of axes labels for Residual and QQ Plots"/>
558 <param name="x_axis_size" type="integer" min="1" value="10" label="Size of x-axis labeling"/>
559 <param name="y_axis_size" type="integer" min="1" value="10" label="Size of y-axis labeling"/>
560 <param name="dot_size" type="integer" min="1" value="3" label="Size of dots in residual plots, QQPlots, volcano plot and comparison plot."/>
561 <param name="text_size" type="integer" min="1" value="4" label="Size of Protein Name label in the graph for Volcano Plot."/>
562 <param name="text_angle" type="integer" min="0" max="360" value="90" label="Angle of x-axis labels represented each comparison at the bottom of graph in comparison plot."/>
563 <param name="legend_size" type="integer" min="1" value="7" label="Size of legend for color at the bottom of volcano plot. "/>
564 <param name="ProteinName" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Display protein names in Volcano Plot." help="Yes (default) means protein names, which are significant, are displayed next to the points. No means no protein names are displayed."/>
565 <param name="colorkey" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Show colour key"/>
566 <param name="numProtein" type="integer" min="1" value="100" max="180" label="Number of proteins which will be presented in each heatmap."/>
567 <param name="clustering" type="select" label="Determines how to order proteins and comparisons. Hierarchical cluster analysis with Ward method(minimum variance) is performed.">
568 <help>’protein’ means that protein dendrogram is computed and reordered based on protein means (the order of row is changed). ’comparison’ means comparison dendrogram is computed and reordered based on comparison means (the order of comparison is changed). ’both’ means to reorder both protein and comparison.</help>
569 <option value="protein" selected="true">protein</option>
570 <option value="comparison">comparison</option>
571 <option value="both">both</option>
572 </param>
573 <param name="width" type="integer" min="1" value="8" label="Width of the saved pdf file"/>
574 <param name="height" type="integer" min="1" value="5" label="Height of the saved pdf file"/>
575 <conditional name="which_Protein">
576 <param name="select" type="select" label="Select protein IDs to draw plots">
577 <option value="all" selected="true">generate all plots for each protein</option>
578 <option value="list">Protein IDs as tabular input</option>
579 </param>
580 <when value="all"/>
581 <when value="list">
582 <param name="protein_list" type="data" format="tabular" label="List of proteins"/>
583 </when>
584 </conditional>
585 <conditional name="which_Comparison">
586 <param name="select" type="select" label="Select comparisons to draw plots">
587 <option value="all" selected="true">Generate all plots for each comparison</option>
588 <option value="list">Comparison names as tabular input</option>
589 </param>
590 <when value="all"/>
591 <when value="list">
592 <param name="comparison_list" type="data" format="tabular" label="List of comparisons"/>
593 </when>
594 </conditional>
595 </section>
413 </inputs> 596 </inputs>
414
415 <outputs> 597 <outputs>
416 <data name="log" format="txt" label="MSstats log"> 598 <data name="log" format="txt" label="${tool.name} on ${on_string}: MSstats log">
417 <filter>'log' in selected_outputs</filter> 599 <filter>'log' in selected_outputs</filter>
418 </data> 600 </data>
419 <data name="r_script" format="txt" label="MSstats Rscript"> 601 <data name="r_script" format="txt" label="${tool.name} on ${on_string}: Rscript">
420 <filter>'r_script' in selected_outputs</filter> 602 <filter>'r_script' in selected_outputs</filter>
421 </data> 603 </data>
422 <data name="processed_data" format="tabular" label="MSstats ProcessedData" from_work_dir="ProcessedData.tsv"> 604 <data name="processed_data" format="tabular" label="${tool.name} on ${on_string}: ProcessedData" from_work_dir="ProcessedData.tsv">
423 <filter>'processed_data' in selected_outputs</filter> 605 <filter>'processed_data' in selected_outputs</filter>
424 <actions> 606 <actions>
425 <action name="column_names" type="metadata" default="PROTEIN,PEPTIDE,TRANSITION,FEATURE,LABEL,GROUP_ORIGINAL,SUBJECT_ORIGINAL,RUN,GROUP,SUBJECT,INTENSITY,SUBJECT_NESTED,ABUNDANCE,FRACTION,originalRUN,censored" /> 607 <action name="column_names" type="metadata" default="PROTEIN,PEPTIDE,TRANSITION,FEATURE,LABEL,GROUP_ORIGINAL,SUBJECT_ORIGINAL,RUN,GROUP,SUBJECT,INTENSITY,SUBJECT_NESTED,ABUNDANCE,FRACTION,originalRUN,censored" />
426 </actions> 608 </actions>
427 </data> 609 </data>
428 <data name="runlevel_data" format="tabular" label="MSstats RunlevelData" from_work_dir="RunlevelData.tsv"> 610 <data name="runlevel_data" format="tabular" label="${tool.name} on ${on_string}: RunlevelData" from_work_dir="RunlevelData.tsv">
429 <filter>'runlevel_data' in selected_outputs</filter> 611 <filter>'runlevel_data' in selected_outputs</filter>
430 <actions> 612 <actions>
431 <action name="column_names" type="metadata" default="RUN,Protein,LogIntensities,NumMeasuredFeature,MissingPercentage,more50missing,NumImputedFeature,originalRUN,GROUP,GROUP_ORIGINAL,SUBJECT_ORIGINAL,SUBJECT_NESTED,SUBJECT" /> 613 <action name="column_names" type="metadata" default="RUN,Protein,LogIntensities,NumMeasuredFeature,MissingPercentage,more50missing,NumImputedFeature,originalRUN,GROUP,GROUP_ORIGINAL,SUBJECT_ORIGINAL,SUBJECT_NESTED,SUBJECT" />
432 </actions> 614 </actions>
433 </data> 615 </data>
434 <data name="qcplot" format="pdf" label="MSstats QCPlot.pdf" from_work_dir="MSStats_only_QCPlot.pdf"> 616 <data name="QCPlot" format="pdf" label="${tool.name} on ${on_string}: QCPlot" from_work_dir="MSStats_only_QCPlot.pdf">
435 <filter>'qcplot' in selected_outputs</filter> 617 <filter>'QCPlot' in selected_outputs</filter>
436 </data> 618 </data>
437 <data name="profile_plot" format="pdf" label="MSstats ProfilePlot.pdf" from_work_dir="MSStats_only_ProfilePlot.pdf"> 619 <data name="ProfilePlot" format="pdf" label="${tool.name} on ${on_string}: Profile Plot" from_work_dir="MSStats_only_ProfilePlot.pdf">
438 <filter>'profile_plot' in selected_outputs</filter> 620 <filter>'ProfilePlot' in selected_outputs</filter>
439 </data> 621 </data>
440 <data name="profile_wsum_plot" format="pdf" label="MSstats ProfilePlot_wSummarization.pdf" from_work_dir="MSStats_only_ProfilePlot_wSummarization.pdf"> 622 <data name="profile_wsum_plot" format="pdf" label="${tool.name} on ${on_string}: Profile Plot with Summarization" from_work_dir="MSStats_only_ProfilePlot_wSummarization.pdf">
441 <filter>'profile_wsum_plot' in selected_outputs</filter> 623 <filter>'profile_wsum_plot' in selected_outputs</filter>
442 </data> 624 </data>
443 <data name="condition_plot" format="pdf" label="MSstats ConditionPlot.pdf" from_work_dir="MSStats_only_ConditionPlot.pdf"> 625 <data name="ConditionPlot" format="pdf" label="${tool.name} on ${on_string}: Condition Plot" from_work_dir="MSStats_only_ConditionPlot.pdf">
444 <filter>'condition_plot' in selected_outputs</filter> 626 <filter>'ConditionPlot' in selected_outputs</filter>
445 </data> 627 </data>
446 <data name="quant_sample_matrix" format="tabular" label="MSstats SampleQuantificationMatrix.tsv" from_work_dir="SampleQuantificationMatrix.tsv"> 628 <data name="quant_sample_matrix" format="tabular" label="${tool.name} on ${on_string}: Sample Quantification Matrix" from_work_dir="SampleQuantificationMatrix.tsv">
447 <filter>'quant_sample_matrix' in selected_outputs</filter> 629 <filter>'quant_sample_matrix' in selected_outputs</filter>
448 </data> 630 </data>
449 <data name="quant_sample_long" format="tabular" label="MSstats SampleQuantificationLong.tsv" from_work_dir="SampleQuantificationLong.tsv"> 631 <data name="quant_sample_long" format="tabular" label=" ${tool.name} on ${on_string}:Sample Quantification Long" from_work_dir="SampleQuantificationLong.tsv">
450 <filter>'quant_sample_long' in selected_outputs</filter> 632 <filter>'quant_sample_long' in selected_outputs</filter>
451 <actions> 633 <actions>
452 <action name="column_names" type="metadata" default="Protein,Group_Subject,LogIntensity" /> 634 <action name="column_names" type="metadata" default="Protein,Group_Subject,LogIntensity" />
453 </actions> 635 </actions>
454 </data> 636 </data>
455 <data name="quant_group_matrix" format="tabular" label="MSstats GroupQuantificationMatrix.tsv" from_work_dir="GroupQuantificationMatrix.tsv"> 637 <data name="quant_group_matrix" format="tabular" label="${tool.name} on ${on_string}: Group Quantification Matrix" from_work_dir="GroupQuantificationMatrix.tsv">
456 <filter>'quant_group_matrix' in selected_outputs</filter> 638 <filter>'quant_group_matrix' in selected_outputs</filter>
457 </data> 639 </data>
458 <data name="quant_group_long" format="tabular" label="MSstats GroupQuantificationLong.tsv" from_work_dir="GroupQuantificationLong.tsv"> 640 <data name="quant_group_long" format="tabular" label="${tool.name} on ${on_string}: Group Quantification Long" from_work_dir="GroupQuantificationLong.tsv">
459 <filter>'quant_group_long' in selected_outputs</filter> 641 <filter>'quant_group_long' in selected_outputs</filter>
460 <actions> 642 <actions>
461 <action name="column_names" type="metadata" default="Protein,Group,LogIntensity" /> 643 <action name="column_names" type="metadata" default="Protein,Group,LogIntensity" />
462 </actions> 644 </actions>
463 </data> 645 </data>
464 <data name="comparison_result" format="tabular" label="MSstats ComparisonResult.tsv" from_work_dir="ComparisonResult.tsv"> 646 <data name="comparison_result" format="tabular" label="${tool.name} on ${on_string}: Comparison Result" from_work_dir="ComparisonResult.tsv">
465 <filter> group['group_comparison'] == 'yes' and 'comparison_result' in group['select_outputs']</filter> 647 <filter> group['group_comparison'] == 'yes' and 'comparison_result' in group['select_outputs']</filter>
466 <actions> 648 <actions>
467 <action name="column_names" type="metadata" default="Protein,Label,log2FC,SE,Tvalue,DF,pvalue,adj.pvalue,issue,MissingPercentage,ImputationPercentage" /> 649 <action name="column_names" type="metadata" default="Protein,Label,log2FC,SE,Tvalue,DF,pvalue,adj.pvalue,issue,MissingPercentage,ImputationPercentage" />
468 </actions> 650 </actions>
469 </data> 651 </data>
470 <data name="fittedmodel" format="txt" label="MSstats ComparisonFittedModel.txt" from_work_dir="ComparisonFittedModel.txt"> 652 <data name="fittedmodel" format="txt" label="${tool.name} on ${on_string}: Comparison Fitted Model" from_work_dir="ComparisonFittedModel.txt">
471 <filter> group['group_comparison'] == 'yes' and 'fittedmodel' in group['select_outputs']</filter> 653 <filter> group['group_comparison'] == 'yes' and 'fittedmodel' in group['select_outputs']</filter>
472 </data> 654 </data>
473 <data name="model_qc" format="tabular" label="MSstats ModelQC.tsv" from_work_dir="ModelQC.tsv"> 655 <data name="model_qc" format="tabular" label="${tool.name} on ${on_string}: Model QC" from_work_dir="ModelQC.tsv">
474 <filter> group['group_comparison'] == 'yes' and 'model_qc' in group['select_outputs']</filter> 656 <filter> group['group_comparison'] == 'yes' and 'model_qc' in group['select_outputs']</filter>
475 <actions> 657 <actions>
476 <action name="column_names" type="metadata" default="RUN,PROTEIN,ABUNDANCE,NumMeasuredFeature,MissingPercentage,more50missing,NumImputedFeature,originalRUN,GROUP,GROUP_ORIGINAL,SUBJECT_ORIGINAL,SUBJECT_NESTED,SUBJECT,residuals,fitted" /> 658 <action name="column_names" type="metadata" default="RUN,PROTEIN,ABUNDANCE,NumMeasuredFeature,MissingPercentage,more50missing,NumImputedFeature,originalRUN,GROUP,GROUP_ORIGINAL,SUBJECT_ORIGINAL,SUBJECT_NESTED,SUBJECT,residuals,fitted" />
477 </actions> 659 </actions>
478 </data> 660 </data>
479 <data name="qqplot" format="pdf" label="MSstats ModelQQ.pdf" from_work_dir="MSStats_group_QQPlot.pdf"> 661 <data name="QQPlots" format="pdf" label="${tool.name} on ${on_string}: Model QQ" from_work_dir="MSStats_group_QQPlot.pdf">
480 <filter> group['group_comparison'] == 'yes' and 'qqplot' in group['select_outputs']</filter> 662 <filter> group['group_comparison'] == 'yes' and 'QQPlots' in group['select_outputs']</filter>
481 </data> 663 </data>
482 <data name="residualplot" format="pdf" label="MSstats ResidualPlot.pdf" from_work_dir="MSStats_group_ResidualPlot.pdf"> 664 <data name="ResidualPlots" format="pdf" label="${tool.name} on ${on_string}: Residual Plot" from_work_dir="MSStats_group_ResidualPlot.pdf">
483 <filter> group['group_comparison'] == 'yes' and 'residualplot' in group['select_outputs']</filter> 665 <filter> group['group_comparison'] == 'yes' and 'ResidualPlots' in group['select_outputs']</filter>
484 </data> 666 </data>
485 <data name="volcanoplot" format="pdf" label="MSstats VolcanoPlot.pdf" from_work_dir="MSStats_group_VolcanoPlot.pdf"> 667 <data name="VolcanoPlot" format="pdf" label="${tool.name} on ${on_string}:Volcano Plot" from_work_dir="MSStats_group_VolcanoPlot.pdf">
486 <filter> group['group_comparison'] == 'yes' and 'volcanoplot' in group['select_outputs']</filter> 668 <filter> group['group_comparison'] == 'yes' and 'VolcanoPlot' in group['select_outputs']</filter>
487 </data> 669 </data>
488 <data name="heatmap" format="pdf" label="MSstats Heatmap.pdf" from_work_dir="MSStats_group_Heatmap.pdf"> 670 <data name="Heatmap" format="pdf" label="${tool.name} on ${on_string}: Heatmap" from_work_dir="MSStats_group_Heatmap.pdf">
489 <filter> group['group_comparison'] == 'yes' and 'heatmap' in group['select_outputs']</filter> 671 <filter> group['group_comparison'] == 'yes' and 'Heatmap' in group['select_outputs']</filter>
490 </data> 672 </data>
491 <data name="comparisonplot" format="pdf" label="MSstats ComparisonPlot.pdf" from_work_dir="MSStats_group_ComparisonPlot.pdf"> 673 <data name="ComparisonPlot" format="pdf" label="${tool.name} on ${on_string}: Comparison Plot" from_work_dir="MSStats_group_ComparisonPlot.pdf">
492 <filter> group['group_comparison'] == 'yes' and 'comparisonplot' in group['select_outputs']</filter> 674 <filter> group['group_comparison'] == 'yes' and 'ComparisonPlot' in group['select_outputs']</filter>
493 </data> 675 </data>
494 </outputs> 676 </outputs>
495 <tests> 677 <tests>
496
497 <test> 678 <test>
498 <conditional name="input"> 679 <conditional name="input">
499 <param name="input_src" value="MSstats"/> 680 <param name="input_src" value="MSstats"/>
500 <param name="msstats_input" ftype="csv" value="msstats_testfile.txt"/> 681 <param name="msstats_input" ftype="csv" value="msstats_testfile.txt"/>
501 </conditional> 682 </conditional>
502 <param name="selected_outputs" value="processed_data,profile_plot,profile_wsum_plot,quant_sample_matrix,quant_group_long"/> 683 <param name="selected_outputs" value="processed_data,ProfilePlot,profile_wsum_plot,quant_sample_matrix,quant_group_long"/>
503 <output name="processed_data"> 684 <output name="processed_data">
504 <assert_contents> 685 <assert_contents>
505 <has_text text="D.GPLTGTYR" /> 686 <has_text text="D.GPLTGTYR" />
506 <has_n_columns n="16" /> 687 <has_n_columns n="16" />
507 <has_n_lines n="2071" /> 688 <has_n_lines n="2071" />
519 <has_text text="LogIntensity" /> 700 <has_text text="LogIntensity" />
520 <has_n_columns n="3" /> 701 <has_n_columns n="3" />
521 <has_n_lines n="37" /> 702 <has_n_lines n="37" />
522 </assert_contents> 703 </assert_contents>
523 </output> 704 </output>
524 <output name="profile_plot" file="MSstats ProfilePlot.pdf" compare="sim_size"/> 705 <output name="ProfilePlot" file="MSstats ProfilePlot.pdf" compare="sim_size"/>
525 <output name="profile_wsum_plot" file="profile_wsum_plot.pdf" compare="sim_size"/> 706 <output name="profile_wsum_plot" file="profile_wsum_plot.pdf" compare="sim_size"/>
526 </test> 707 </test>
527 708
528 <test> 709 <test>
529 <conditional name="input"> 710 <conditional name="input">
532 </conditional> 713 </conditional>
533 <conditional name="group"> 714 <conditional name="group">
534 <param name="group_comparison" value="yes"/> 715 <param name="group_comparison" value="yes"/>
535 <param name="comparison_matrix" ftype="csv" value="comparison_matrix.csv"/> 716 <param name="comparison_matrix" ftype="csv" value="comparison_matrix.csv"/>
536 </conditional> 717 </conditional>
537 <param name="select_outputs" value="residualplot,model_qc"/> 718 <param name="select_outputs" value="ResidualPlots,model_qc"/>
538 <output name="processed_data"> 719 <output name="processed_data">
539 <assert_contents> 720 <assert_contents>
540 <has_text text="D.GPLTGTYR" /> 721 <has_text text="D.GPLTGTYR" />
541 <has_n_columns n="16" /> 722 <has_n_columns n="16" />
542 <has_n_lines n="2071" /> 723 <has_n_lines n="2071" />
547 <has_text text="MissingPercentage" /> 728 <has_text text="MissingPercentage" />
548 <has_n_columns n="15" /> 729 <has_n_columns n="15" />
549 <has_n_lines n="108" /> 730 <has_n_lines n="108" />
550 </assert_contents> 731 </assert_contents>
551 </output> 732 </output>
552 <output name="residualplot" file="residual_plot.pdf" compare="sim_size"/> 733 <output name="ResidualPlots" file="residual_plot.pdf" compare="sim_size"/>
553 </test> 734 </test>
554 735
555 <test> 736 <test>
556 <conditional name="input"> 737 <conditional name="input">
557 <param name="input_src" value="MaxQuant"/> 738 <param name="input_src" value="MaxQuant"/>
558 <param name="evidence" ftype="tabular" value="test_MQ_evidence.tabular"/> 739 <param name="evidence" ftype="tabular" value="test_MQ_evidence.tabular"/>
559 <param name="annotation" ftype="tabular" value="test_MQ_annotation.txt"/> 740 <param name="annotation" ftype="tabular" value="test_MQ_annotation.txt"/>
560 <param name="proteinGroups" ftype="tabular" value="test_MQ_proteingroups.tabular"/> 741 <param name="proteinGroups" ftype="tabular" value="test_MQ_proteingroups.tabular"/>
561 </conditional> 742 </conditional>
562 <param name="selected_outputs" value="condition_plot,processed_data,runlevel_data"/> 743 <param name="selected_outputs" value="ConditionPlot,processed_data,runlevel_data"/>
563 <conditional name="group"> 744 <conditional name="group">
564 <param name="group_comparison" value="yes"/> 745 <param name="group_comparison" value="yes"/>
565 <param name="comparison_matrix" ftype="csv" value="test_MQ_group12_comparison_matrix.csv"/> 746 <param name="comparison_matrix" ftype="csv" value="test_MQ_group12_comparison_matrix.csv"/>
566 </conditional> 747 </conditional>
567 <param name="select_outputs" value="qqplot,comparison_result"/> 748 <param name="select_outputs" value="QQPlots,comparison_result"/>
568 <output name="processed_data"> 749 <output name="processed_data">
569 <assert_contents> 750 <assert_contents>
570 <has_text text="SPILVATAVAAR" /> 751 <has_text text="SPILVATAVAAR" />
571 <has_n_columns n="16" /> 752 <has_n_columns n="16" />
572 <has_n_lines n="57" /> 753 <has_n_lines n="61" />
573 </assert_contents> 754 </assert_contents>
574 </output> 755 </output>
575 <output name="runlevel_data"> 756 <output name="runlevel_data">
576 <assert_contents> 757 <assert_contents>
577 <has_text text="qx017084.raw.thermo" /> 758 <has_text text="qx017084.raw.thermo" />
584 <has_text text="r2-r1" /> 765 <has_text text="r2-r1" />
585 <has_n_columns n="11" /> 766 <has_n_columns n="11" />
586 <has_n_lines n="4" /> 767 <has_n_lines n="4" />
587 </assert_contents> 768 </assert_contents>
588 </output> 769 </output>
589 <output name="condition_plot" file="condition_plot.pdf" compare="sim_size"/> 770 <output name="ConditionPlot" file="condition_plot.pdf" compare="sim_size"/>
590 <output name="qqplot" file="qq_plot.pdf" compare="sim_size"/> 771 <output name="QQPlots" file="qq_plot.pdf" compare="sim_size"/>
591 </test> 772 </test>
592 773
593 <!--
594 <test> 774 <test>
595 <conditional name="input"> 775 <conditional name="input">
596 <param name="input_src" value="OpenMS"/> 776 <param name="input_src" value="OpenMS"/>
597 <param name="evidence" ftype="tabular" value=""/> 777 <param name="openms_input" ftype="tabular" value="openms_input.tabular"/>
598 <param name="annotation" ftype="tabular" value=""/> 778 </conditional>
779 <param name="selected_outputs" value="ConditionPlot,processed_data,runlevel_data"/>
780 <conditional name="group">
781 <param name="group_comparison" value="yes"/>
782 <param name="comparison_matrix" ftype="tabular" value="openms_comparisonmatrix.tabular"/>
783 <param name="select_outputs" value="Heatmap"/>
599 </conditional> 784 </conditional>
600 <output name="processed_data"> 785 <output name="processed_data">
601 <assert_contents> 786 <assert_contents>
602 <has_text text="D.GPLTGTYR" /> 787 <has_text text="AAAPGIQLVAGEGFQSPLEDR_2_NA_0" />
603 </assert_contents> 788 <has_text text="sp|P09938|RIR2_YEAST" />
604 </output> 789 <has_n_columns n="16" />
790 <has_n_lines n="121" />
791 </assert_contents>
792 </output>
793 <output name="runlevel_data">
794 <assert_contents>
795 <has_text text="sp|P09457|ATPO_YEAST" />
796 <has_n_columns n="13" />
797 <has_n_lines n="76" />
798 </assert_contents>
799 </output>
800 <output name="ConditionPlot" file="condition_plot_openms.pdf" compare="sim_size"/>
801 <output name="Heatmap" file="Heatmap_openms.pdf" compare="sim_size"/>
605 </test> 802 </test>
606 --> 803 <test>
804 <conditional name="input">
805 <param name="input_src" value="Skyline"/>
806 <param name="skyline_input" ftype="csv" value="skyline_input_first100.csv"/>
807 <param name="annotation" ftype="csv" value="skyline_annotations.csv"/>
808 <param name="removeProtein_with1Peptide" value="TRUE"/>
809 </conditional>
810 <conditional name="summarize">
811 <param name="MBimpute" value="FALSE"/>
812 <param name="censoredInt" value="NULL"/>
813 </conditional>
814 <param name="selected_outputs" value="log,ProfilePlot,processed_data,quant_sample_long"/>
815 <param name="featureName" value="Peptide"/>
816 <param name="width" value="10"/>
817 <param name="height" value="7"/>
818 <conditional name="group">
819 <param name="group_comparison" value="yes"/>
820 <param name="comparison_matrix" ftype="tabular" value="comparison_matrix_skyline.tabular"/>
821 <param name="select_outputs" value="VolcanoPlot,ComparisonPlot,comparison_result"/>
822 </conditional>
823 <param name="FCcutoff" value="2" />
824 <conditional name="which_Comparison">
825 <param name="select" value="list"/>
826 <param name="comparison_list" ftype="tabular" value="comparison_list_skyline.tabular"/>
827 </conditional>
828 <output name="quant_sample_long">
829 <assert_contents>
830 <has_text text="P32125" />
831 <has_text text="Condition5_5" />
832 <has_n_columns n="3" />
833 <has_n_lines n="6" />
834 </assert_contents>
835 </output>
836 <output name="log">
837 <assert_contents>
838 <has_text text="ADVGFLC" />
839 <has_text text="1 level of Isotope type labeling in this experiment" />
840 <has_text text="The required input : provided - okay" />
841 </assert_contents>
842 </output>
843 <output name="processed_data">
844 <assert_contents>
845 <has_text text="ADVGFLC[+57]NMLER_2_sum_NA" />
846 <has_text text="319070944" />
847 <has_n_columns n="15" />
848 <has_n_lines n="46" />
849 </assert_contents>
850 </output>
851 <output name="comparison_result">
852 <assert_contents>
853 <has_text text="c1-c4" />
854 <has_text text="log2FC" />
855 <has_n_lines n="4" />
856 </assert_contents>
857 </output>
858 <output name="ProfilePlot" file="Profile_plot_skyline.pdf" compare="sim_size"/>
859 <output name="VolcanoPlot" file="Volcano_plot_skyline.pdf" compare="sim_size"/>
860 <output name="ComparisonPlot" file="Comparison_plot_skyline.pdf" compare="sim_size"/>
861 </test>
607 862
608 <test> 863 <test>
609 <conditional name="input"> 864 <conditional name="input">
610 <param name="input_src" value="OpenSWATH"/> 865 <param name="input_src" value="OpenSWATH"/>
611 <param name="evidence" ftype="tabular" value="test_swath_input_data.tabular"/> 866 <param name="openswath_input" ftype="tabular" value="test_swath_input_data.tabular"/>
612 <param name="annotation" ftype="tabular" value="test_swath_annotations.tabular"/> 867 <param name="annotation" ftype="tabular" value="test_swath_annotations.tabular"/>
613 </conditional> 868 </conditional>
614 <output name="processed_data"> 869 <output name="processed_data">
615 <assert_contents> 870 <assert_contents>
616 <has_text text="GETLGLIGFGR" /> 871 <has_text text="GETLGLIGFGR" />
617 <has_n_columns n="16" /> 872 <has_n_columns n="16" />
618 <has_n_lines n="253" /> 873 <has_n_lines n="253" />
619 </assert_contents> 874 </assert_contents>
620 </output> 875 </output>
621 <output name="qcplot" file="QC_plot.pdf" compare="sim_size"/> 876 <output name="QCPlot" file="QC_plot.pdf" compare="sim_size"/>
622 </test> 877 </test>
623 878
624 <test> 879 <test>
625 <conditional name="input"> 880 <conditional name="input">
626 <param name="input_src" value="OpenSWATH"/> 881 <param name="input_src" value="OpenSWATH"/>
627 <param name="evidence" ftype="tabular" value="test_swath_input_data.tabular"/> 882 <param name="openswath_input" ftype="tabular" value="test_swath_input_data.tabular"/>
628 <param name="annotation" ftype="tabular" value="test_swath_annotations.tabular"/> 883 <param name="annotation" ftype="tabular" value="test_swath_annotations.tabular"/>
629 </conditional> 884 </conditional>
630 <param name="selected_outputs" value="r_script,processed_data,quant_sample_long"/> 885 <param name="selected_outputs" value="r_script,processed_data,quant_sample_long"/>
631 <conditional name="group"> 886 <conditional name="group">
632 <param name="group_comparison" value="yes"/> 887 <param name="group_comparison" value="yes"/>
633 <param name="comparison_matrix" ftype="csv" value="test_swath_group12_comparison_matrix.csv"/> 888 <param name="comparison_matrix" ftype="csv" value="test_swath_group12_comparison_matrix.csv"/>
634 </conditional> 889 </conditional>
635 <param name="select_outputs" value="comparison_result,volcanoplot,residualplot"/> 890 <param name="select_outputs" value="comparison_result,VolcanoPlot,ResidualPlots"/>
636 <output name="processed_data"> 891 <output name="processed_data">
637 <assert_contents> 892 <assert_contents>
638 <has_text text="GETLGLIGFGR" /> 893 <has_text text="GETLGLIGFGR" />
639 <has_n_columns n="16" /> 894 <has_n_columns n="16" />
640 <has_n_lines n="253" /> 895 <has_n_lines n="253" />
652 <has_text text="Q5VYK3" /> 907 <has_text text="Q5VYK3" />
653 <has_n_columns n="11" /> 908 <has_n_columns n="11" />
654 <has_n_lines n="6" /> 909 <has_n_lines n="6" />
655 </assert_contents> 910 </assert_contents>
656 </output> 911 </output>
657 <output name="volcanoplot" file="volcanoplot.pdf" compare="sim_size"/> 912 <output name="VolcanoPlot" file="volcanoplot.pdf" compare="sim_size"/>
658 <output name="residualplot" file="residualplot.pdf" compare="sim_size"/> 913 <output name="ResidualPlots" file="residualplot.pdf" compare="sim_size"/>
659 </test> 914 </test>
660 915
661 </tests> 916 </tests>
662 <help><![CDATA[ 917 <help><![CDATA[
663 MSstats is an open-source R package for statistical relative quantification of proteins and peptides in global, targeted and data-independent proteomics. `More information on MSstats <http://msstats.org/>`_ 918 MSstats is an open-source R package for statistical relative quantification of proteins and peptides in global, targeted and data-independent proteomics. `More information on MSstats <http://msstats.org/>`_
666 921
667 ----- 922 -----
668 923
669 **Input data** 924 **Input data**
670 925
671 - Data in tabular or csv format, generated by spectral processing tools such as `MaxQuant <http://coxdocs.org/doku.php?id=maxquant:start/>`_, `OpenSWATH <http://openswath.org/en/latest/>`_ will be automatically converted to 10-column MSstats format 926 - Data in tabular or csv format, either in the 10-column MSstats format or the outputs of spectral processing tools such as `MaxQuant <http://coxdocs.org/doku.php?id=maxquant:start/>`_, `OpenSWATH <http://openswath.org/en/latest/>`_
672 927
673 - MaxQuant format: evidence.txt, proteinGroups.txt 928 - MSstats format: tabular file with 10 columns, either manually curated or from other sources such as the Swath2stats tool, which is implemented in Pyprophet export in Galaxy. For manual curation: Names of headers are fixed but not case sensitive:
674 - OpenSWATH format: pyprophet export file
675 - MSstats format: tabular file with 10 columns, either manually curated or from other sources such as the swath2stats tool, which is implemented in Pyprophet export in Galaxy. For manual curation: Names of headers are fixed but not case sensitive:
676 929
677 - ProteinName: protein ID or peptide ID for peptide-level modeling and analysis; statistical analysis will be done separately for each unique label in this column 930 - ProteinName: protein ID or peptide ID for peptide-level modeling and analysis; statistical analysis will be done separately for each unique label in this column
678 - PeptideSequence: Amino acid sequence for each peptides. If the peptide sequences should be distinguished based on post-translational modifications, this column can be renamed to PeptideModifiedSequence. 931 - PeptideSequence: Amino acid sequence for each peptide. If the peptide sequences should be distinguished based on post-translational modifications, this column can be renamed to PeptideModifiedSequence.
679 - PrecursorCharge: charge state of precursor. 932 - PrecursorCharge: charge state of precursor.
680 - FragmentIon: e.g. b4, y3, if unknown use a single value for all entries. 933 - FragmentIon: e.g. b4, y3, if unknown use a single value for all entries.
681 - ProductCharge: charge state of product. If unknown use 0 for all entries. 934 - ProductCharge: charge state of product. If unknown use 0 for all entries.
682 - IsotopeLabelType: This column indicates whether this measurement is based on the endogenous peptides (use “L”) or labeled reference peptides (use “H”). 935 - IsotopeLabelType: This column indicates whether this measurement is based on the endogenous peptides (use “L”) or labeled reference peptides (use “H”).
683 - Condition: For group comparison experiments, this column indicates groups of interest (such as “Disease” or “Control”). For time-course experiments, this column indicates time points (such as “T1”, “T2”, etc). If the experimental design contains both distinct groups of subjects and multiple time points per subject, this column should indicate a combination of these values (such as “Disease_T1”, “Disease_T2”, “Control_T1”, “Control_T2”, etc.). 936 - Condition: For group comparison experiments, this column indicates groups of interest (such as “Disease” or “Control”). The name of the condition is not allowed to start with a number or contain any special characters. For time-course experiments, this column indicates time points (such as “T1”, “T2”, etc). If the experimental design contains both distinct groups of subjects and multiple time points per subject, this column should indicate a combination of these values (such as “Disease_T1”, “Disease_T2”, “Control_T1”, “Control_T2”, etc.).
684 - BioReplicate: This column should contain a unique identifier for each biological replicate in the experiment. For example, in a clinical proteomic investigation this should be a unique patient id. Patients from distinct groups should have distinct ids. MSstats does not require the presence of technical replicates in the experiment. If the technical replicates are present, all samples or runs from a same biological replicate should have a same id. MSstats automatically detects the presence of technical replicates and accounts for them in the model-based analysis. 937 - BioReplicate: This column should contain a unique identifier for each biological replicate in the experiment. For example, in a clinical proteomic investigation this should be a unique patient id. Patients from distinct groups should have distinct ids. MSstats does not require the presence of technical replicates in the experiment. If the technical replicates are present, all samples or runs from a same biological replicate should have a same id. MSstats automatically detects the presence of technical replicates and accounts for them in the model-based analysis.
685 - Run: This column contains the identifier of a mass spectrometry run. Each mass spectrometry run should have a unique identifier, regardless of the origin of the biological sample. In SRM experiments, if all the transitions of a biological or a technical replicate are split into multiple “methods” due to the technical limitations, each method should have a separate identifier. When processed by Skyline, distinct values of runs correspond to distinct input file names. It is possible to use the actual input file names as values in the column Run. 938 - Run: This column contains the identifier of a mass spectrometry run. Each mass spectrometry run should have a unique identifier, regardless of the origin of the biological sample. In SRM experiments, if all the transitions of a biological or a technical replicate are split into multiple “methods” due to the technical limitations, each method should have a separate identifier. When processed by Skyline, distinct values of runs correspond to distinct input file names. It is possible to use the actual input file names as values in the column Run.
686 - Intensity: This column should contain the quantified signal of a feature in a run without any transformation (in particular, no logarithm transform). The signals can be quantified as the peak height or the peak of area under curve. Any other quantitative representation of abundance can also be used. 939 - Intensity: This column should contain the quantified signal of a feature in a run without any transformation (in particular, no logarithm transform). The signals can be quantified as the peak height or the peak of area under curve. Any other quantitative representation of abundance can also be used.
687 - Example file header: 940 - Example file header:
688 :: 941 ::
693 P02768 ETYGEMADCCAK 2 b3 0 946 P02768 ETYGEMADCCAK 2 b3 0
694 P02768 ETYGEMADCCAK 2 b4 0 947 P02768 ETYGEMADCCAK 2 b4 0
695 ... ... ... ... ... 948 ... ... ... ... ...
696 949
697 isotopelabeltype condition bioreplicate run intensity 950 isotopelabeltype condition bioreplicate run intensity
698 L 1 ReplA 1 4298.12 951 L disease ReplA 1 4298.12
699 H 1 ReplA 1 1974.59 952 H disease ReplA 1 1974.59
700 L 1 ReplA 1 7183.22 953 L disease ReplA 1 7183.22
701 H 1 ReplA 1 8467.58 954 H disease ReplA 1 8467.58
702 ... ... ... ... ... 955 ... ... ... ... ...
703 956
957 - MaxQuant format: evidence.txt, proteinGroups.txt; plus externally generated annotation file
958 - OpenSWATH format: pyprophet export file; plus externally generated annotation file
704 959
705 - Annotations as tabular file are needed for all input options except MSstats format 960 - Annotations as tabular file are needed for all input options except MSstats format
706 961
707 - 4 columns with exactly these headers: Raw.file, Condition, BioReplicate, Run; additional 5th column only for MaxQuant: IsotopeLabelType 962 - 4 columns with exactly these headers: Raw.file, Condition, BioReplicate, Run; additional 5th column only for MaxQuant: IsotopeLabelType
708 963
709 - Raw.file: File name that has to match exactly as it appears in the other input files (e.g. S1207.raw.thermo; in/AA12_mzML.mzML) 964 - Raw.file:
710 - all other columns: see description above for MSstats format columns 965
966 - OpenSWATH: File name needs to match exactly how it is written in the OpenSWATH output (e.g. "in/AA12_mzML.mzML")
967 - MaxQuant: File name needs to fit to how it is written in MaxQuant output, but the ".raw" has to be removed (e.g. "file1.raw.thermo.raw" --> "file1.raw.thermo")
968 - Condition: The name of the condition is not allowed to start with a number or contain any special characters
969 - All other columns: see description above for MSstats format columns
711 970
712 - Comparison matrix as tabular file 971 - Comparison matrix as tabular file
713 972
714 - 1st column: name of comparison 973 - 1st column: name of comparison
715 - additionally one column for each condition that is present in the tabular file. Use 1 and -1 to indicate the conditions to compare and 0 for conditions that are not compared. Multiple groups can be combined by using 0.5. 974 - Additionally one column for each condition that is present in the tabular file. Use 1 and -1 to indicate the conditions to compare and 0 for conditions that are not compared. Multiple groups can be combined by using 0.5.
716 - first row contains the names of the groups, they must exactly match the condition name used in the annotation file 975 - First row contains the names of the groups, they must exactly match the condition name used in the annotation file
717 - each additional row represents one comparison 976 - Each additional row represents one comparison
718 - Example for a two group comparison 977 - Example for a two group comparison
719 978
720 :: 979 ::
721 980
722 names groupA groupB 981 names groupA groupB
733 G3-G5 0 0 -1 0 1 992 G3-G5 0 0 -1 0 1
734 G1+G2-G5 0.5 0.5 0 0 -1 993 G1+G2-G5 0.5 0.5 0 0 -1
735 994
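The comparison matrix layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the tool: the condition names G1 to G5, the helper names and the sum-to-zero check are assumptions made for the example (balanced contrasts are the usual convention, but the check itself is not taken from the tool).

```python
import csv
import io


def build_comparison_matrix(conditions, comparisons):
    """Return rows of a comparison matrix: a header row, then one row per comparison.

    conditions: condition names, exactly as used in the annotation file.
    comparisons: {comparison name: {condition: weight}}; unlisted conditions get 0.
    """
    rows = [["names"] + list(conditions)]
    for name, weights in comparisons.items():
        unknown = set(weights) - set(conditions)
        if unknown:
            raise ValueError(f"{name}: unknown conditions {sorted(unknown)}")
        row = [weights.get(condition, 0) for condition in conditions]
        # Assumption for this sketch: a valid contrast has weights summing to zero.
        if abs(sum(row)) > 1e-9:
            raise ValueError(f"{name}: weights do not sum to zero")
        rows.append([name] + row)
    return rows


def to_tabular(rows):
    """Serialize the rows as tab-separated text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical condition names; replace with those from your annotation file.
conditions = ["G1", "G2", "G3", "G4", "G5"]
comparisons = {
    "G3-G5": {"G3": -1, "G5": 1},
    "G1+G2-G5": {"G1": 0.5, "G2": 0.5, "G5": -1},
}
matrix_rows = build_comparison_matrix(conditions, comparisons)
print(to_tabular(matrix_rows), end="")
```

Generating the file this way keeps the header row and the weights aligned, which is easy to get wrong when editing a wide matrix by hand.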
736 **Options** 995 **Options**
737 996
738 - data conversion from MaxQuant and OpenSWATH to MSstats format: 997 - Data conversion from MaxQuant and OpenSWATH to MSstats format:
739 998
740 - MaxQuant input: + Contaminant, + Reverse, + Only.identified.by.site, proteins are automatically removed during conversion 999 - MaxQuant input: Contaminants and reverse and only identified by site from MaxQuant tool are automatically removed during conversion
741 1000
742 - data processing options: 1001 - Data processing options:
743 1002
744 - MaxQuant input: Contaminants and reverse and only ID by site) from MaxQuant tool are automatically removed; 1003 - Log transformation: log2 or log10 transformation of intensities
745 - log transformation 1004 - Normalization of MS runs: If there are multiple fractionations or injections for one sample, normalization is performed by each fractionation or different m/z range from multiple injections.
746 - normalization of MS runs 1005
1006 - equalizeMedians: The default option for normalization is equalizeMedians, where all intensities in a run are shifted by a constant, to equalize the median of intensities across runs for label-free experiment. This normalization method is appropriate when we can assume that the majority of proteins do not change across runs. Be cautious when using the equalizeMedians option for a label-free DDA dataset with only a small number of proteins. For label based experiment, equalizeMedians equalizes the median of reference intensities across runs and is generally proper even for a dataset with a small number of proteins.
1007 - globalStandards: If you have a spiked-in standard, you may set this option to define the standard with the nameStandards option.
1008 - quantile: The distribution of all the intensities in each run will become the same across runs for label-free experiment. For label-based experiment, the distribution of all the reference intensities will become the same across runs and all the endogenous intensities are shifted by a constant corresponding to reference intensities.
1009 - FALSE: No normalization is performed. If you had your own normalization before MSstats use this option.
1010
747 - Feature selection 1011 - Feature selection
1012
1013 - all: Use all features in the dataset.
1014 - top3: Use top 3 features which have highest average of log(intensity) across runs.
1015 - topN: Use top N (specify number) features which have highest average of log(intensity) across runs.
1016 - highQuality: Detect and flag uninformative features (as Uninformative in the feature_quality column) and outliers (as TRUE in the is_outlier column). This uninformative content may be excluded from run-level summarization by setting the 'remove features flagged with uninformative feature quality' option to TRUE.
1017
1018 - Summarizing intensities per MS run
1019
1020 - TMP: Tukey’s median polish. Robust parameter estimation method with median across rows and columns. Prerequisite for missing value imputation.
1021 - linear: Linear model (lm function). Average-based summarization.
1022
748 - Missing value imputation: 1023 - Missing value imputation:
749 1024
750 - MaxQuant input: All missing values are NA, censoredInt must be 'NA' 1025 - Impute Missing Values: Only possible for Summarization Method TMP. Censored missing values will be determined (by censored intensity, cutoff value for censoring and maximum quantile for deciding censored missing values) and imputed by an Accelerated Failure Time (AFT) model.
751 - OpenSWATH input: censoredInt must be '0' 1026
752 - Summary method: TMP + censoredInt = NULL: It assumes that all intensities are missing at random, therefore no action with MBimpute = FALSE or error with MBimpute = TRUE 1027 - Remove runs which have more than 50% missing values: Yes or no.
753 - censoredInt='NA' or '0' & MBimpute=TRUE: AFT model-based imputation using cutoffCensored value in the AFT model 1028 - Account for heterogeneous variation among intensities from different features: Yes: assumes equal variance among intensities from features. No: means that we cannot assume equal variance among intensities from features, then we will account for heterogeneous variation from different features
754 - censoredInt='NA' or '0' & MBimpute=FALSE: censored intensities (here NA’s) will be replaced with the value specified in cutoffCensored. 1029 - Censored Intensity: The processing tools report missing values differently. This option distinguishes which value should be considered as missing, and further whether it is censored or missing at random
755 - Summarizing intensities per MS run 1030
756 - group comparison: automatic detection of differentially abundant proteins between two conditions, conditions have to be specified with the 'comparison matrix' 1031 - NA - It assumes that all NAs in Intensity column are censored.
757 - quantification per sample or group 1032 - 0 - It assumes that all values between 0 and 1 in Intensity column are censored. If there are NAs in Intensity with this option, NAs will be considered as random missing.
758 1033 - NULL - It assumes that all missing values are randomly missing.
759 - sample: relative protein abundance in each biological replicate. If there are technical replicates for biological replicates, sample quantification will be the median among technical replicates. If there is no technical replicate for biological replicate (sample), sample quantification will be the same as run-level summarization. 1034 - Skyline and OpenSWATH input should use '0'. MaxQuant input should use 'NA'
760 - group: relative protein abundance in each condition, summarized over the biological replicates (median among sample quantification). In presence of completely missing values in a condition, the estimates will be zero 1035 - Cutoff value for censoring: cutoff for AFT model; only with censored intensity 'NA' or '0'; if NULL it assumes that there is no censored missing and no imputation will be performed. If there are completely missing measurements in a run for a protein, no imputation will be performed. In addition, a condition that has no measurement at all for a protein will not be imputed.
1036
1037 - minimum value for each feature: cutoff for AFT model will be the minimum value for each feature across runs. With this option, those runs with substantial missing measurements will be biased by the cutoff value. In such case, you may remove the runs that have more than 50% missing values from the analysis.
1038 - minimum value for each run: cutoff for AFT model will be the minimum value for each run across features
1039 - smallest between minimum value of corresponding feature and minimum value of corresponding run: cutoff for AFT model will be the smallest value between minimum value of corresponding feature and minimum value of corresponding run
1040 - Maximum quantile for deciding censored missing values: If you don’t want to apply the threshold of noise intensity in your data, you can use maxQuantileforCensored=NULL.
1041 - Missing value imputation combination with summarization method TMP:
1042
1043 - Summarization method: TMP + censored intensity: 'NULL': It assumes that all intensities are missing at random, therefore no action with missing value imputation: No; or error with missing value imputation: Yes.
1044 - Missing value imputation: Yes + censored intensity:'NA' or '0': AFT model-based imputation using cutoff value for censoring in the AFT model
1045 - Missing value imputation: No + censored intensity:'NA' or '0': censored intensities will be replaced with the value specified in cutoff value for censoring
1046
1047 - Group comparison: automatic detection of differentially abundant proteins between two conditions, conditions have to be specified with the 'comparison matrix'
1048 - Quantification per sample or group: choose the corresponding output option
1049
1050 - Sample: relative protein abundance in each biological replicate. If there are technical replicates for biological replicates, sample quantification will be the median among technical replicates. If there is no technical replicate for biological replicate (sample), sample quantification will be the same as run-level summarization.
1051 - Group: relative protein abundance in each condition, summarized over the biological replicates (median among sample quantification). In presence of completely missing values in a condition, the estimates will be zero
1052
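As a rough illustration of the equalizeMedians normalization described above, the sketch below shifts every log-transformed intensity in a run by one constant so that all runs end up with the same median. It is not the MSstats implementation; the function name and the choice of the median of the run medians as the common target are assumptions made for this example.

```python
from statistics import median


def equalize_medians(log_intensities_by_run):
    """Shift each run by a constant so that all run medians coincide.

    log_intensities_by_run: {run id: list of log-transformed intensities}.
    Assumption for this sketch: the common target is the median of the run medians.
    """
    run_medians = {run: median(values) for run, values in log_intensities_by_run.items()}
    target = median(run_medians.values())
    return {
        run: [value - run_medians[run] + target for value in values]
        for run, values in log_intensities_by_run.items()
    }


# Hypothetical log2 intensities for three runs.
runs = {
    "run1": [10.0, 12.0, 14.0],
    "run2": [11.0, 13.0, 15.0],
    "run3": [9.0, 11.0, 13.0],
}
normalized = equalize_medians(runs)
for run, values in normalized.items():
    print(run, values, "median:", median(values))
```

Because each run is only shifted, differences between features within a run are preserved. This is why the method relies on the assumption that most proteins do not change across runs.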
761 1053
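The sample and group quantification described above (median among technical replicates, then median among samples) can be sketched as follows. This is an illustration under simplifying assumptions, not the tool's code: it handles a single protein, the input layout and function name are hypothetical, and completely missing conditions are not handled.

```python
from collections import defaultdict
from statistics import median


def quantify(run_level):
    """Compute sample and group quantification for one protein.

    run_level: list of (condition, bioreplicate, log_intensity) tuples,
    one per run, i.e. the run-level summarized values.
    """
    # Sample quantification: median among technical replicates of a biological replicate.
    per_sample = defaultdict(list)
    for condition, bioreplicate, value in run_level:
        per_sample[(condition, bioreplicate)].append(value)
    sample_quant = {key: median(values) for key, values in per_sample.items()}

    # Group quantification: median among the sample quantifications of a condition.
    per_group = defaultdict(list)
    for (condition, _), value in sample_quant.items():
        per_group[condition].append(value)
    group_quant = {condition: median(values) for condition, values in per_group.items()}
    return sample_quant, group_quant


# Hypothetical run-level values; P1 has two technical replicates.
runs = [
    ("Disease", "P1", 20.0),
    ("Disease", "P1", 22.0),
    ("Disease", "P2", 24.0),
    ("Control", "P3", 18.0),
    ("Control", "P4", 19.0),
]
sample_quant, group_quant = quantify(runs)
print(sample_quant)
print(group_quant)
```

With a single run per biological replicate, the sample quantification equals the run-level value, matching the description above.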
762 **Output options** 1054 **Output options**
763 1055
764 - Different outputs available. Especially for studies with many proteins, it is suggested to select only the necessary pdf outputs as many of them generate one plot per protein. 1056 - Different outputs available. Especially for studies with many proteins, it is suggested to select only the necessary pdf outputs as many of them generate one plot per protein.
765 1057
766 - MSstats log - check log file for warnings and information on the analysis steps (txt) 1058 - MSstats log - check log file for warnings and information on the analysis steps (txt)
767 - r-script - can be used to re-run analysis outside Galaxy (txt) 1059 - MSstats Rscript - can be used to re-run analysis outside Galaxy or to inspect the executed code (txt)
768 - processed_data - transformed, normalized, imputed intensities (tabular) 1060 - MSstats ProcessedData - transformed, normalized, imputed intensities (tabular)
769 - runlevel_data - summarized intensities per run (tabular) 1061
770 - qcplot - log2 intensity boxplot for all proteins and run on first page, followed by one boxplot per protein (pdf) 1062 - Intensity column: includes original intensity values
771 - profile_plot - log2 intensity profiles one plot per protein and run (pdf) 1063 - Abundance column: contains the log2 transformed and normalized intensities and it will be used for run-level summarization
772 - profile_wsum_plot - log2 intensity profiles one plot per protein and run with run summarization (pdf) 1064 - Censored column: has the decision about censored missing or not, based on censored Intensity and maximum quantile for deciding censored missing values options. Abundances with TRUE value in censored column will be considered as censored missing and imputed when Missing value imputation: Yes.
773 - condition_plot - log2 intensity range for each protein and condition (pdf) 1065
774 - quant_sample_matrix - relative protein abundance in each biological replicate (tabular) 1066 - MSstats RunlevelData - run and protein level summarized data (tabular)
775 - quant_sample_long - relative protein abundance in each biological replicate, long format (tabular) 1067
776 - quant_group_matrix - relative protein abundance in each condition (tabular) 1068 - LogIntensities: log intensity summarized per run and protein, they will be used for the group comparison and summarized profile plot
777 - quant_group_long - relative protein abundance in each condition, long format (tabular) 1069 - NumMeasuredFeature: shows how many features were used for summarization of the corresponding run and protein
778 - comparison_result - summary of statistical results per protein and comparison (tabular) 1070 - MissingPercentage: percentage of random and censored missing in the corresponding run and protein out of the total number of features in the corresponding protein.
779 - model_qc - summary statistics per run (tabular) 1071 - more50missing: whether MissingPercentage is greater than 50% or not
780 - qqplot - one QQplot per protein (pdf) 1072 - NumImputedFeatures: how many features were imputed in the corresponding run and protein
781 - residualplot - one residual plot per protein (pdf) 1073
782 - volcanoplot - one volcano plot per comparison (pdf) 1074 - MSstats QCPlot - log2 intensity boxplot for all proteins and run on first page, followed by one boxplot per protein (pdf)
783 - heatmap - needs at least 2 comparisons, one heatmap for all proteins and comparisons (pdf) 1075 - MSstats ProfilePlot - log2 intensity profiles one plot per protein and run (pdf)
784 - comparisonplot - log2 intensity range for each protein and comparison (pdf) 1076
1077 - Profile plot helps identify potential sources of variation (both variation of interest and nuisance variation) for each protein: it shows individual measurements for each peptide (peptide for DDA, transition for SRM or DIA) across runs, grouped per condition. Each peptide has a different color/type layout. Disconnected lines show that there are missing values (NA).
1078
1079 - MSstats ProfilePlot_wSummarization - log2 intensity profiles one plot per protein and run with run summarization (pdf)
1080
1081 - Run-level summarized data per protein. The same peptides (or transition) in the first plot are presented in grey, with the summarized values overlaid in red.
1082
1083 - MSstats ConditionPlot - log2 intensity range for each protein and condition (pdf)
1084
1085 - Visualizes potential systematic differences in protein intensities between conditions. Dots indicate the mean of log2 intensities for each condition, error bars indicate the confidence interval at the 0.95 confidence level for each condition. The intervals are for descriptive purposes only.
1086
1087 - Sample Quantification Matrix/Long Table - relative protein abundance in each biological replicate in matrix (rows are proteins, and columns are combinations of biological replicate and group, filled with LogIntensities) or long format (rows correspond to relative protein abundances, and columns are Protein, Group, BioReplicate, LogIntensities) (tabular)
1088
1089 - If there are technical replicates for biological replicates, sample quantification will be the median among technical replicates. If there is no technical replicate for biological replicate (sample), sample quantification will be the same as run-level summarization. In presence of completely missing values in a biological replicate, the estimates will be zero.
1090
1091 - Group Quantification Matrix/Long Table - relative protein abundance in each condition in matrix (rows are proteins, and columns are groups) or long format (rows correspond to relative protein abundances, and columns are Protein, Group and LogIntensities) (tabular)
1092
1093 - Outputs the estimates of relative protein abundance in each condition, summarized over the biological replicates (median among sample quantification). In presence of completely missing values in a condition, the estimates will be zero.
1094
1095 - MSstats ComparisonFittedModel (txt)
1096 - MSstats ComparisonResult - summary of statistical results per protein and comparison (tabular)
1097
1098 - Label: name of the comparison (e.g. condition1 - condition2)
1099 - log2FC: log2 fold change for the given comparison name, e.g. condition1-condition2: positive values mean more abundant in condition1, negative values mean more abundant in condition2
1100 - SE: standard error of the log2 fold change
1101 - Tvalue: test statistic of the Student's t-test
1102 - DF: degrees of freedom of the Student's t-test
1103 - pvalue: raw p-values
1104 - adj.pvalue: adjusted p-values among all the proteins in the specific comparison
1105 - issue: shows whether there is any issue for inference in the corresponding protein and comparison, for example OneConditionMissing or CompleteMissing. If one of the conditions in a comparison is completely missing, the result is flagged with OneConditionMissing, with adj.pvalue=0 and log2FC=Inf or -Inf even though pvalue=NA. For example, if you compare ‘condition1-condition2’ but condition2 is completely missing, then log2FC=Inf and adj.pvalue=0; SE, Tvalue, and pvalue will be NA. If you compare ‘condition1-condition2’ but condition1 is completely missing, then log2FC=-Inf and adj.pvalue=0. Please be careful when using this log2FC and adj.pvalue.
1106
1107 - MSstats ModelQC - summary statistics per run and protein (tabular)
1108
1109 - MSstats QQPlot - one QQplot per protein (pdf)
1110
1111 - Normal quantile-quantile plots for each protein, taking as input the results of model fitting and testing in groupComparison. Only large deviations of transition intensities from the straight line are problematic and indicate that the assumption of the normal distribution of the measurement errors may not hold.
1112
1113 - MSstats ResidualPlot - one residual plot per protein (pdf)
1114
1115 - Residual plot shows variance of the residuals that is associated with the mean feature intensity. Any specific pattern, such as increasing or decreasing by predicted abundance, is problematic and indicates that the assumption of constant variance of the measurement error may not hold.
1116
1117 - MSstats VolcanoPlot - one volcano plot per comparison (pdf)
1118
1119 - Visualizes the outcome of one comparison between conditions for all the proteins, and combines the information on statistical and practical significance. The y-axis displays the FDR-adjusted p-values on the negative log10 scale, representing statistical significance. The horizontal dashed line shows the FDR cutoff. The points above the FDR cutoff line are statistically significant proteins that are differentially abundant across conditions. These points are colored in red and blue for upregulated and downregulated proteins, respectively. The x-axis is the model-based estimate of fold change on log scale and represents practical significance. It is possible to specify a practical significance cutoff based on the estimate of fold change in addition to the statistical significance cutoff. If the fold change cutoff is specified, the points above the horizontal cutoff line but within the vertical cutoff lines will be considered as not differentially abundant (and will be colored in black).
1120
1121 - MSstats Heatmap - needs at least 2 comparisons, one heatmap for all proteins and comparisons (pdf)
1122
1123 - Illustrates the patterns of up- and down-regulation of proteins in several comparisons. Columns in the heatmap are the comparisons of conditions defined in the contrast matrix, and rows are proteins. The heatmap displays signed FDR-adjusted p-values of the tests, colored in red/blue for significantly up-/down-regulated proteins, while taking into account the specified FDR cutoff and the additional optional fold change cutoff. Brighter colors indicate stronger evidence in favor of differential abundance. Black color represents proteins that are not significantly differentially abundant.
1124
1125 - MSstats ComparisonPlot - log2 intensity range for each protein and comparison (pdf)
1126
1127 - Illustrates model-based estimates of log-fold changes, and the associated uncertainty, in several comparisons of conditions for one protein. X-axis is the comparison of interest. Y-axis is the log fold change. The dots are the model-based estimates of log-fold change, and the error bars are the model-based 95% confidence intervals. For simplicity, the confidence intervals are adjusted for multiple comparisons within protein only, using the Bonferroni approach. For proteins with N comparisons, the individual confidence intervals are at the level of 1-sig/N.
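The per-protein Bonferroni adjustment described for the ComparisonPlot can be illustrated with a small calculation. This is only a sketch of the stated formula (level = 1 - sig/N), not the MSstats implementation; the function name `bonferroni_ci_level` and its parameters are hypothetical.

```python
def bonferroni_ci_level(sig: float, n_comparisons: int) -> float:
    """Confidence level of each individual interval for one protein.

    sig           -- overall significance level (e.g. 0.05); assumed input
    n_comparisons -- number of comparisons N made for this protein
    """
    if not 0.0 < sig < 1.0:
        raise ValueError("sig must lie strictly between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    # Bonferroni: split the significance level evenly across N comparisons.
    return 1.0 - sig / n_comparisons


# With sig = 0.05 and 5 comparisons, each interval is drawn at roughly
# the 99% level instead of the unadjusted 95% level.
level = bonferroni_ci_level(0.05, 5)
```

A protein that appears in more comparisons therefore gets wider individual intervals for the same overall significance level.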
785 1128
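The sample quantification rule described in the output list above (median among technical replicates, zero when a biological replicate is completely missing) can be sketched as follows. This is an illustration under stated assumptions, not the MSstats code: the function name `sample_quantification`, the tuple layout of the input rows, and the choice to ignore partially missing runs before taking the median are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def sample_quantification(run_level_rows):
    """Summarize run-level LogIntensities per biological replicate.

    run_level_rows -- iterable of (protein, group, bio_replicate, log_intensity)
                      tuples, one per run; log_intensity is None when missing
                      (hypothetical layout for this sketch).
    Returns a dict keyed by (protein, group, bio_replicate).
    """
    grouped = defaultdict(list)
    for protein, group, bio_replicate, log_intensity in run_level_rows:
        grouped[(protein, group, bio_replicate)].append(log_intensity)

    result = {}
    for key, values in grouped.items():
        # Assumption: missing runs are dropped before taking the median.
        observed = [v for v in values if v is not None]
        # Completely missing biological replicate -> estimate of zero,
        # as stated in the help text.
        result[key] = median(observed) if observed else 0.0
    return result


rows = [
    ("P1", "A", "s1", 10.0),  # three technical replicates of sample s1
    ("P1", "A", "s1", 12.0),
    ("P1", "A", "s1", 11.0),
    ("P1", "A", "s2", None),  # sample s2 is completely missing
]
quant = sample_quantification(rows)
```

With a single run per biological replicate the median is just that run's value, which matches the statement that sample quantification then equals the run-level summarization.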
786 For additional help please visit the `MSstats documentation <http://msstats.org/msstats-2/>`_ 1129 For additional help please visit the `MSstats documentation <http://msstats.org/msstats-2/>`_
787 1130
788 1131
789 ]]></help> 1132 ]]></help>