comparison README.md @ 1:b02af8eb8e6e draft
planemo upload for repository https://github.com/HegemanLab/VKMZ commit 5e7a43415df3902b44b7623cb2c6ffb8845751ac
author | eslerm |
---|---|
date | Wed, 30 May 2018 13:17:32 -0400 |
parents | 0b8ddf650752 |
children |
comparison
0:0b8ddf650752 | 1:b02af8eb8e6e |
---|---|
1 # VKMZ version 1.0 | 1 # VKMZ version 1.0 |
2 | 2 |
3 VKMZ is a metabolomics visualization tool which creates van Krevelen diagrams from mass spectrometry data. A van Krevelen diagram (VKD) plots a molecule on a scatterplot based on the molecule's oxygen to carbon ratio (O:C) against its hydrogen to carbon ratio (H:C). Classes of metabolites cluster together on a VKD [0]. Plotting a complex mixture of metabolites on a VKD can be used to briefly convey untargeted metabolomics data. | 3 VKMZ is a metabolomics visualization tool which creates van Krevelen diagrams from mass spectrometry data. A van Krevelen diagram (VKD) plots a molecule on a scatterplot based on the molecule's oxygen to carbon ratio (O:C) against its hydrogen to carbon ratio (H:C). Classes of metabolites cluster together on a VKD [0]. Plotting a complex mixture of metabolites on a VKD can be used to briefly convey untargeted metabolomics data. |
4 | 4 |
5 VKMZ can be used as a standalone tool or on the Galaxy Project web platform [1]. | 5 For each feature in the data, VKMZ attempts to predict a molecular formula by comparing the feature's mass to a database of known formula masses through a binary search. Heuristically generated databases for labeled and unlabeled data are included with VKMZ [1]. A prediction is made when a known mass is within the mass error of the observed mass. VKMZ finds all predictions for an observed mass within the mass error. The prediction with the lowest delta (absolute difference between an observed and predicted mass) is plotted. Features without predictions are discarded. Using low-resolution data may result in too many predictions per feature to be useful, especially for large-mass metabolites. A VKD is created from the predictions and output as a tabular file and an html file. Predictions and original feature information can be found in VKMZ's output. |
6 ## Using VKMZ | |
7 | 6 |
8 VKMZ is designed to use XCMS [2] data as input. Tabular data can also be used as input. For each feature in the data, VKMZ attempts to predict its molecular formula by comparing the feature's mass to a database of known formula masses. Heuristically generated databases for unlabeled and labeled data are included with VKMZ. Users can define their own database. A VKD is created from formulas with predictions and output as a webpage and tabular file. | 7 VKMZ can be used as a command line tool or on the Galaxy web platform [2]. A Galaxy wrapper for VKMZ is maintained in this repository. VKMZ was developed on the Workflow4Metabolomics version of Galaxy [3]. |
8 | |
9 ## Using VKMZ from command line | |
10 | |
11 VKMZ is designed to use data processed by XCMS [4] as input. Tabular data can also be used as input. | |
9 | 12 |
10 ### Input modes | 13 ### Input modes |
11 | 14 |
12 VKMZ has three modes: | 15 VKMZ has three modes: |
13 1. `tsv` mode reads a specially formatted tabular file | 16 1. `xcms` mode reads features from XCMS data |
14 2. `xcms` mode reads features in [XCMS](https://bioconductor.org/packages/release/bioc/html/xcms.html) data | 17 2. `tsv` mode reads a specially formatted tabular file |
15 3. `plot` mode replots VKMZ tabular data | 18 3. `plot` mode replots VKMZ tabular data |
16 | 19 |
17 Select a mode by declaring it as the first argument to `vkmz.py`. | 20 Select a mode by declaring it as the first argument to `vkmz.py`. |
18 | 21 |
19 > **Example:** | 22 > **Example:** |
20 > ``` | 23 > ``` |
21 > python vkmz.py xcms [options] | 24 > python vkmz.py xcms [options] |
22 > ``` | 25 > ``` |
23 | 26 |
24 Different modes take different parameters. | 27 Different modes allow different parameters. |
28 | |
29 ### All modes | |
25 | 30 |
26 All modes require an output parameter: | 31 All modes require an output parameter: |
27 * `--output [FILENAME]` | 32 * `--output [FILENAME]` |
28 * A `.tsv` and/or `.html` will be generated by VKMZ with this parameter as the file name. | 33 * A `.tsv` and `.html` file will be generated by VKMZ with the given filename |
29 * The `.tsv` and `.html` files generated by VKMZ are named by this option | |
30 | 34 |
31 All modes allow these options: | 35 All modes allow these optional parameters: |
32 * `--plot-type [scatter-2d]` | 36 * `--plot-type [scatter-2d]` |
37 * There is currently only one plot type | |
38 * Default is `scatter-2d` | |
33 * `--size [INTEGER]` | 39 * `--size [INTEGER]` |
34 * Set base size of marker dots of the VKD | 40 * Set base size of marker dots on VKD |
41 * Default size is 5 | |
35 * `--size-algorithm [{1,2}]` | 42 * `--size-algorithm [{1,2}]` |
36 * Choose algorithm to modify marker size | 43 * Choose one of the following algorithms: |
37 1. Uniform base size | 44 * 1: Sets all markers to the base size specified by `--size` |
38 2. Intensity relative size | 45 * Default |
46 * 2: Marker sizes are relative to feature's log intensity | |
39 | 47 |
40 #### xcms and tsv modes | 48 #### xcms and tsv modes |
41 | 49 |
42 Both xcms and tsv modes require the mass error, in parts-per-million, of the mass spectrometer which generated the data: | 50 Both xcms and tsv modes require the mass error, in parts-per-million, of the mass spectrometer which generated the data: |
43 * `--error [PPM_ERROR_NUMBER]` | 51 * `--error [PPM_ERROR_NUMBER]` |
52 * It is critical to set the error correctly | |
44 | 53 |
45 There are several options for xcms and tsv modes: | 54 There are several optional parameters for xcms and tsv modes: |
46 * `--database [DATABASE_FILE]` | 55 * `--no-adjustment` |
47 * default is BMRB's monoisotopic heuristically generated database [3] | 56 * Using this flag disables nominal mass adjustment |
57 * Without this flag VKMZ adjusts feature masses by adding or removing the mass of a proton based on the feature's polarity | |
58 * `--database [DATABASE_FILE_PATH]` | |
59 * Default is BMRB's monoisotopic heuristically generated database | |
60 * This path is relative | |
48 * `--directory [TOOL_PATH]` | 61 * `--directory [TOOL_PATH]` |
49 * define tool directory | 62 * Explicitly define tool directory |
63 * Sets root directory for database file path | |
50 * `--no-plot` | 64 * `--no-plot` |
51 * disable html plot generation | 65 * Disable html output |
52 | 66 |
53 #### xcms mode | 67 #### xcms mode |
54 | 68 |
55 xcms mode requires tabular files generated by XCMS: | 69 xcms mode requires three tabular files generated by XCMS: |
56 * `--data-matrix [XCMS_DATA_MATRIX_FILE]` | 70 * `--data-matrix [XCMS_DATA_MATRIX_FILE]` |
57 * `--sample-metadata [XCMS_SAMPLE_METADATAFILE]` | 71 * `--sample-metadata [XCMS_SAMPLE_METADATAFILE]` |
58 * `--variable-metadata [XCMS_VARIABLE_METADATAFILE]` | 72 * `--variable-metadata [XCMS_VARIABLE_METADATAFILE]` |
59 | 73 |
60 ##### xcms mode example: | 74 ##### xcms mode example: |
61 ``` | 75 ```
62 python vkmz.py xcms --data-matrix test-data/datamatrix.tabular --sample-metadata test-data/sampleMetadata.tabular --variable-metadata test-data/variableMetadata.tabular --output report --error 3 | 76 python vkmz.py xcms --data-matrix test-data/datamatrix.tabular --sample-metadata test-data/sampleMetadata.tabular --variable-metadata test-data/variableMetadata.tabular --output report --error 3 |
63 ``` | 77 ``` |
64 | 78 |
65 #### tsv mode | 79 #### tsv mode |
66 | 80 |
67 tsv mode requires a tabular file of a specific format as input. | 81 tsv mode requires a tabular file of a specific format as input: |
68 * `--input [TSV FILE]` | 82 * `--input [TSV FILE]` |
69 | 83 |
70 The first five columns of the input tabular file must be: | 84 The first five columns of the input tabular file must be: |
71 | 85 >| sample ID | polarity | mz | retention time | intensity | |
72 | sample ID | polarity | mz | retention time | intensity | | 86 >|-----------|----------|----|----------------|-----------| |
73 |-----------|----------|----|----------------|-----------| | |
74 | 87 |
75 #### plot mode | 88 #### plot mode |
76 | 89 |
77 plot mode reads previously generated VKMZ tabular files to create VKD html files. | 90 plot mode reads a previously generated VKMZ tabular file to create a VKD html file. |
78 | 91 |
79 Specifying the VKMZ tabular file is required: | 92 Specifying the VKMZ tabular file is required: |
80 * `--input [VKMZ_TSV_FILE]` | 93 * `--input [VKMZ_TSV_FILE]` |
81 | 94 |
95 ## Special thanks to | |
96 | |
97 Adrian, Art, Eric, Jerry, Kevin, Renata, Stephen, Tim, and Yuan. | |
98 | |
82 ## Citations | 99 ## Citations |
83 | 100 |
84 0. Brockman et al. [doi:10.1007/s11306-018-1343-y](https://doi.org/10.1007/s11306-018-1343-y) | 101 0. Brockman et al. [doi:10.1007/s11306-018-1343-y](https://doi.org/10.1007/s11306-018-1343-y) |
85 1. Galaxy Project [Galaxy](https://github.com/galaxyproject/galaxy) | 102 1. Hegeman et al. [doi:10.1021/ac070346t](https://doi.org/10.1021/ac070346t) |
86 2. Giacomoni et al. [doi:10.1093/bioinformatics/btu813](https://doi.org/10.1093/bioinformatics/btu813) | 103 2. [Galaxy Project](https://galaxyproject.org/) |
87 3. Hegeman et al. [doi:10.1021/ac070346t](https://doi.org/10.1021/ac070346t) | 104 3. [Workflow4Metabolomics](http://workflow4metabolomics.org/) |
105 4. Smith et al. [doi:10.1021/ac051437y](https://www.ncbi.nlm.nih.gov/pubmed/16448051) |
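
The prediction step described in the revised README (new line 5, together with the `--error` and `--no-adjustment` options) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not VKMZ's actual code: the function names, the three-entry toy database, the direction of the proton adjustment, and the choice to take the ppm tolerance relative to the observed mass are all hypothetical.

```python
from bisect import bisect_left

PROTON_MASS = 1.007276  # mass of a proton in daltons

# Hypothetical toy database, sorted by monoisotopic mass.
# Each entry is (mass, element counts); the databases shipped with VKMZ differ.
DATABASE = [
    (46.041865, {"C": 2, "H": 6, "O": 1}),    # C2H6O
    (60.021129, {"C": 2, "H": 4, "O": 2}),    # C2H4O2
    (180.063388, {"C": 6, "H": 12, "O": 6}),  # C6H12O6
]


def adjust_mass(mz, polarity):
    """Convert an observed m/z to a neutral mass.

    Assumption: positive-mode features carry one extra proton and
    negative-mode features lack one. The README only says a proton mass
    is added or removed based on polarity.
    """
    if polarity == "positive":
        return mz - PROTON_MASS
    if polarity == "negative":
        return mz + PROTON_MASS
    return mz


def predict(mass, ppm_error, database=DATABASE):
    """Return all database entries within the mass error, lowest delta first."""
    # Assumption: the tolerance is computed relative to the observed mass.
    tolerance = mass * ppm_error / 1e6
    masses = [known_mass for known_mass, _ in database]
    # Binary search for the first known mass at or above the lower bound.
    start = bisect_left(masses, mass - tolerance)
    candidates = []
    for known_mass, formula in database[start:]:
        if known_mass > mass + tolerance:
            break
        candidates.append((abs(known_mass - mass), known_mass, formula))
    candidates.sort(key=lambda candidate: candidate[0])
    return candidates


def van_krevelen(formula):
    """Return (O:C, H:C) for a formula, or None when it has no carbon."""
    carbon = formula.get("C", 0)
    if carbon == 0:
        return None
    return formula.get("O", 0) / carbon, formula.get("H", 0) / carbon


if __name__ == "__main__":
    neutral = adjust_mass(181.0707, "positive")
    hits = predict(neutral, ppm_error=3)
    if hits:
        delta, known_mass, formula = hits[0]  # lowest-delta prediction
        print(formula, van_krevelen(formula))
    else:
        print("no prediction; feature would be discarded")
```

A wider `ppm_error` widens the search window, which is why low-resolution data can return many candidates per feature. The sketch keeps every candidate in the list and treats the first one as the prediction to plot.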