comparison README @ 7:b0ce669ce794 draft

planemo upload for repository https://github.com/HegemanLab/VKMZ commit 722cd42705f87f2dc11aa6984ae0836ad4ca41a6-dirty
author eslerm
date Thu, 20 Dec 2018 00:36:04 -0500
parents 35b984684450
children
6:35b984684450 7:b0ce669ce794
1 # VKMZ version 1.2.0 1 vkmz v1.4dev1
2 2
3 VKMZ is a metabolomics prediction and visualization tool which creates van Krevelen diagrams from mass spectrometry data. A van Krevelen diagram (VKD) plots a molecule on a 2D scatterplot based on the molecule's oxygen to carbon ratio (O:C) against its hydrogen to carbon ratio (H:C). Classes of metabolites cluster together on a VKD [0]. Plotting a complex mixture of metabolites on a VKD can be used to briefly convey untargeted metabolomics data. 3 vkmz predicts molecular formulas by searching a known mass-formula dictionary for a feature observed by a mass spectrometer. Elemental ratios for predicted features are calculated to create the oxygen-to-carbon and hydrogen-to-carbon axes of a van Krevelen diagram (VKD). VKDs are a convenient visualization tool for briefly conveying the constituents of a complex MS mixture (e.g., untargeted plant metabolomics). As output, predicted features are saved to a tabular file, an interactive VKD web page, and other optional formats.
4
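The elemental ratios that place a formula on a VKD can be sketched as follows. This is a minimal illustration, not vkmz's own code: the helper names `element_counts` and `vk_ratios` are hypothetical, and the parser assumes a simple formula string with no parentheses, charges, or isotope labels.

```python
import re


def element_counts(formula):
    """Count atoms per element in a simple formula such as 'C6H12O6'."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def vk_ratios(formula):
    """Return (O:C, H:C) for a formula, or None if it has no carbon."""
    counts = element_counts(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        return None  # ratios are undefined without carbon
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon


print(vk_ratios("C6H12O6"))  # glucose: O:C = 1.0, H:C = 2.0
```

A formula without carbon has no position on the diagram, so the sketch returns `None` rather than dividing by zero.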
5 VKMZ attempts to predict a molecular formula for each feature in LC-MS data. Each feature's mass is compared to a database of known formula masses. A prediction is made when a known mass is within the mass error range of a feature's uncharged (neutral) mass. A binary search algorithm is used to quickly make matches. Heuristically generated databases for labeled and unlabeled metabolites are included [1]. VKMZ finds all predictions for an observed mass within the mass error. The prediction with the lowest delta (absolute difference between a feature's neutral mass and the predicted mass) is plotted. Features without predictions are discarded. Output is saved as a tabular file and an HTML file.
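The matching step described above might look roughly like the sketch below. It is an assumption-laden illustration rather than VKMZ's implementation: the function name `predict`, the parallel-list database layout, and the three example masses are all hypothetical, and the binary search comes from Python's standard `bisect` module.

```python
from bisect import bisect_left, bisect_right


def predict(neutral_mass, ppm_error, masses, formulas):
    """Return (delta, formula) pairs within the ppm window, lowest delta first.

    `masses` must be sorted ascending and parallel to `formulas`.
    """
    # Convert the parts-per-million error into an absolute mass window.
    window = neutral_mass * ppm_error / 1e6
    # Binary search for the slice of known masses inside the window.
    lo = bisect_left(masses, neutral_mass - window)
    hi = bisect_right(masses, neutral_mass + window)
    candidates = [
        (abs(masses[i] - neutral_mass), formulas[i]) for i in range(lo, hi)
    ]
    candidates.sort()  # smallest delta first
    return candidates


# Illustrative database entries (monoisotopic masses, sorted ascending).
masses = [180.06339, 180.07864, 194.07904]
formulas = ["C6H12O6", "C10H12O3", "C7H14O6"]

matches = predict(180.0635, 10, masses, formulas)
best = matches[0] if matches else None  # a feature with no match is discarded
print(best)
```

Returning every candidate, sorted by delta, keeps all predictions available while making the lowest-delta prediction the first element.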
6
7 This software works best with accurate, high-resolution LC-MS data. A well-calibrated LC-MS is essential for correct predictions. It is best to empirically derive mass error either from the data or from data acquired using the same methods and spiked standards. Using low-resolution data will result in false positive predictions, especially for large-mass metabolites.
8
9 VKMZ can be used as a command line tool or on the Galaxy web platform [2]. A Galaxy wrapper for VKMZ is maintained in this repository. VKMZ was developed on the Workflow4Metabolomics version of Galaxy [3].
10
11 ## Using VKMZ command line
12
13 ### Input modes
14
15 VKMZ has two input modes:
16 1. `xcms` mode reads features from XCMS data
17 2. `tsv` mode reads a specially formatted tabular file
18
19 Select a mode by declaring it as the first argument to `vkmz.py`.
20
21 > **Example:**
22 > ```
23 > python vkmz.py xcms [other parameters]
24 > ```
25
26 Different modes allow different parameters.
27
28 ### Required parameters
29
30 #### xcms mode
31
32 xcms mode requires three tabular files generated by XCMS:
33 * `--data-matrix [XCMS_DATA_MATRIX_FILE]`
34 * `--sample-metadata [XCMS_SAMPLE_METADATAFILE]`
35 * `--variable-metadata [XCMS_VARIABLE_METADATAFILE]`
36
37 ##### xcms mode example:
38 ```
39 python vkmz.py xcms --data-matrix test-data/datamatrix.tabular --sample-metadata test-data/sampleMetadata.tabular --variable-metadata test-data/variableMetadata.tabular [other parameters]
40 ```
41
42 #### tsv mode
43
44 tsv mode requires a tabular file of a specific format as input:
45 * `--input [TSV_FILE]`
46
47 The first five columns of the input tabular file must be:
48 >| sample_id | polarity | mz | rt | intensity |
49 >|-----------|----------|----|----|-----------|
50
51
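As a rough aid for preparing tsv mode input, the sketch below checks the first five columns before a file is handed to the tool. It is not part of vkmz: the function name `read_features` is hypothetical, and it assumes the file is tab-separated with the column names in a header row.

```python
import csv
import io

REQUIRED = ["sample_id", "polarity", "mz", "rt", "intensity"]


def read_features(handle):
    """Read rows from a tab-separated handle after validating the header."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[:5] != REQUIRED:
        raise ValueError("first five columns must be: " + ", ".join(REQUIRED))
    # Keep only the five required columns; skip empty lines.
    return [dict(zip(REQUIRED, row[:5])) for row in reader if row]


# Illustrative in-memory file with a single made-up feature.
example = io.StringIO(
    "sample_id\tpolarity\tmz\trt\tintensity\n"
    "s1\tpositive\t181.0707\t120.5\t10500\n"
)
print(read_features(example))
```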
52 #### All modes
53
54 Mass error of LC-MS in parts-per-million:
55 * `--error [PPM_ERROR_NUMBER]`
56 * It is critical to set the mass error correctly
57
58 Output name:
59 * `--output [FILENAME]`
60 * A `.tsv` and `.html` file will be generated by VKMZ with the given filename
61
62 ### Optional parameters
63
64 Database:
65 * `--database [DATABASE_FILE_PATH]`
66 * Default is BMRB's monoisotopic heuristically generated database
67 * Path is relative to `--directory`
68
69 Directory:
70 * `--directory [TOOL_PATH]`
71 * Explicitly define tool directory
72 * Paths are relative if unset
73 * Affects database and web page template paths
74
75 Forced Polarity:
76 * `--polarity [positive|negative]`
77 * Set all features to have either a positive or negative polarity
78   * Overrides the input file's polarity information
79 * Do not use this parameter on data containing both polarities
80
81 Neutral:
82 * `--neutral`
83 * Using this flag disables charged mass adjustment
84   * Without this flag VKMZ adjusts a feature's mass by adding or removing the mass of a proton based on the feature's polarity
85
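The charged mass adjustment that `--neutral` disables can be sketched as below. This is an illustration under stated assumptions, not VKMZ's code: the function name `neutral_mass` is hypothetical, features are assumed to be singly charged, and the direction of the adjustment follows the common convention that a positive-mode ion carries an extra proton and a negative-mode ion has lost one.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def neutral_mass(mz, polarity, neutral=False):
    """Return the uncharged mass for a singly charged feature."""
    if neutral:
        return mz  # mirrors --neutral: the mass is used as given
    if polarity == "positive":
        return mz - PROTON_MASS  # assumed [M+H]+: remove the added proton
    if polarity == "negative":
        return mz + PROTON_MASS  # assumed [M-H]-: restore the lost proton
    raise ValueError("polarity must be 'positive' or 'negative'")


print(neutral_mass(181.070666, "positive"))
```

Other adducts or multiply charged ions would need a different correction, which this sketch does not attempt.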
86 Unique:
87 * `--unique`
88 * Remove features with multiple predictions from output
89
90 ## Special thanks to
91
92 Adrian, Art, Eric, Jerry, Kevin, Renata, Stephen, Tim, and Yuan.
93
94 ## Citations
95
96 0. Brockman et al. [doi:10.1007/s11306-018-1343-y](https://doi.org/10.1007/s11306-018-1343-y)
97 1. Hegeman et al. [doi:10.1021/ac070346t](https://doi.org/10.1021/ac070346t)
98 2. [Galaxy Project](https://galaxyproject.org/)
99 3. [Workflow4Metabolomics](http://workflow4metabolomics.org/)
100 4. Smith et al. [doi:10.1021/ac051437y](https://www.ncbi.nlm.nih.gov/pubmed/16448051)