diff README.md @ 1:b02af8eb8e6e draft
planemo upload for repository https://github.com/HegemanLab/VKMZ commit 5e7a43415df3902b44b7623cb2c6ffb8845751ac
author | eslerm |
---|---|
date | Wed, 30 May 2018 13:17:32 -0400 |
parents | 0b8ddf650752 |
children |
--- a/README.md Wed May 02 18:31:06 2018 -0400
+++ b/README.md Wed May 30 13:17:32 2018 -0400
@@ -2,16 +2,19 @@

 VKMZ is a metabolomics vizualization tool which creates van Krevelen diagrams from mass spectrometry data. A van Krevelen diagram (VKD) plots a molecule on a scatterplot based on the molecule's oxygen to carbon ratio (O:C) against it's hydrogen to carbon ratio (H:C). Classes of metabolites cluster together on a VKD [0]. Plotting a complex mixture of metabolites on a VKD can be used to briefly convey untargeted metabolomics data.

-VKMZ can be used as a standalone tool or on the Galaxy Project web platform [1].
-## Using VKMZ
+For each feature in the data VKMZ attempts to predict a molecular formula by comparing the feature's mass to a database of known formula masses though a binary search. Heristically generated databases for labeled and unlabeled data are included with VKMZ [1]. A prediction is made when a known mass is within a mass error of the observed mass. VKMZ finds all predictions for an observed mass within the mass error. The prediction with the lowest delta (absolute difference between an observed and predicted mass) is plotted. Features without predictions are discarded. Using low resolution data may result in finding too many predictions per feature to be useful, especially for large mass metabolites. A VKD is created from predictions and outputed as a tabular and html file. Predictions and original feature information can be found in VKMZ' output.
+
+VKMZ can be used as a command line tool or on the Galaxy web platform [2]. A Galaxy wrapper for VKMZ is maintatined in this repository. VKMZ was developed on the Workflow4Metabolomics version of Galaxy [3].

-VKMZ is designed to use XCMS [2] data as input. Tabular data can also be used as input. For each feature in the data VKMZ attempts to predict it's molecular formula by comparing the features mass to a database of known formula masses. Heristically generated databases for unlabeled and labeled data is included with VKMZ. Users can define their own database. A VKD is created from formulas with predictions and outputed as a webpage and tabular file.
+## Using VKMZ from command line
+
+VKMZ is designed to use data processed by XCMS [4] as input. Tabular data can also be used as input.

 ### Input modes

 VKMZ has three modes:
- 1. `tsv` mode reads a specially formatted tabular file
- 2. `xcms` mode reads features in [XCMS](https://bioconductor.org/packages/release/bioc/html/xcms.html) data
+ 1. `xcms` mode reads features from XCMS data
+ 2. `tsv` mode reads a specially formatted tabular file
  3. `plot` mode replots VKMZ tabular data

 Select a mode by declaring it as the first argument to `vkmz.py`.
@@ -21,38 +24,49 @@
 > python vkmz.py xcms [options]
 > ```

-Different modes take different parameters.
+Different modes allow different parameters.
+
+### All modes

 All modes require an output parameter:
  * `--output [FILENAME]`
-   * A `.tsv` and/or `.html` will be generated by VKMZ with this paraameter as the file name.
-   * A `.tsv` and `.html` files generated by VKMZ are named by this option
+   * A `.tsv` and `.html` file will be generated by VKMZ with the given filename

-All modes allow these options:
+All modes allow these optional parameters:
  * `--plot-type [scatter-2d]`
+   * There is currently only one plot type
+   * Default is `scatter-2d`
  * `--size [INTEGER]`
-   * Set base size of marker dots of the VKD
+   * Set base size of marker dots on VKD
+   * Default size is 5
  * `--size-algorithm [{1,2}]`
-   * Choose algorithm to modify marker size
-     1. Uniform base size
-     2. Intensity relative size
+   * Choose one of the following algorithms:
+     * 1: Sets all markers to the base size specified by `--size`
+       * Default
+     * 2: Marker sizes are relative to feature's log intensity

 #### xcms and tsv modes

 Both xcms and tsv mode require the mass error, in parts-per-million, of the mass spectrometer which generated the data:
  * `--error [PPM_ERROR_NUMBER]`
+   * It is critical to set the error correctly

-There are several options for xcms and tsv modes:
- * `--database [DATABASE_FILE]`
-   * default is BMRB's monoisotopic heuristically generated database [3]
+There are several optional parameters for xcms and tsv modes:
+ * `--no-adjustment`
+   * Using this flag disables nominal mass adjustment
+   * Without this flag VKMZ adjusts feature masses by adding or removing that mass of a proton based on the features polarity
+ * `--database [DATABASE_FILE_PATH]`
+   * Default is BMRB's monoisotopic heuristically generated database
+   * This path is relative
  * `--directory [TOOL_PATH]`
-   * define tool directory
+   * Explicitly define tool directory
+   * Sets root directory for database file path
  * `--no-plot`
-   * disable html plot generation
+   * Disable html output

 #### xcms mode

-xcms mode requires tabular files generated by XCMS:
+xcms mode requires three tabular files generated by XCMS:
  * `--data-matrix [XCMS_DATA_MATRIX_FILE]`
  * `--sample-metadata [XCMS_SAMPLE_METADATAFILE]`
  * `--variable-metadata [XCMS_VARIABLE_METADATAFILE]`
@@ -64,24 +78,28 @@

 #### tsv mode

-tsv mode requires a tabular file of a specific format as input.
+tsv mode requires a tabular file of a specific format as input:
  * `--input [TSV FILE]`

 The first five columns of the input tabular file must be:
-
-| sample ID | polarity | mz | retention time | intensity |
-|-----------|----------|----|----------------|-----------|
+>| sample ID | polarity | mz | retention time | intensity |
+>|-----------|----------|----|----------------|-----------|

 #### plot mode

-plot mode reads previously generated VKMZ tabular files to create VKD html files.
+plot mode reads a previously generated VKMZ tabular files to create VKD html files.

 Specifying the VKMZ tabular file is required:
  * `--input [VKMZ_TSV_FILE]`

+## Special thanks to
+
+Adrian, Art, Eric, Jerry, Kevin, Renata, Stephen, Tim, and Yuan.
+
 ## Citations

 0. Brockman et al. [doi:10.1007/s11306-018-1343-y](https://doi.org/10.1007/s11306-018-1343-y)
-1. Galaxy Project [Galaxy](https://github.com/galaxyproject/galaxy)
-2. Giacomoni et al. [doi:10.1093/bioinformatics/btu813](https://doi.org/10.1093/bioinformatics/btu813)
-3. Hegeman et al. [doi:10.1021/ac070346t](https://doi.org/10.1021/ac070346t)
+1. Hegeman et al. [doi:10.1021/ac070346t](https://doi.org/10.1021/ac070346t)
+2. [Galaxy Project](https://galaxyproject.org/)
+3. [Workflow4Metabolomics](http://workflow4metabolomics.org/)
+4. Smith et al. [doi:10.1021/ac051437y](https://www.ncbi.nlm.nih.gov/pubmed/16448051)
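
The README text added in this changeset describes the prediction step in prose: a binary search over a mass-sorted database, a parts-per-million error window, and selection of the candidate with the lowest delta. The following is a minimal sketch of that idea, not the code in `vkmz.py`. The names `DATABASE`, `predict_formula`, and `vk_ratios` are hypothetical, the four-entry database is a toy stand-in for the bundled one, the tolerance is assumed to be computed relative to the observed mass, and the proton adjustment controlled by `--no-adjustment` is left out.

```python
import bisect
import re

# Toy database sorted by mass: (monoisotopic mass, formula).
# Illustrative only; the database shipped with VKMZ is far larger.
DATABASE = [
    (89.04768, "C3H7NO2"),
    (180.06339, "C6H12O6"),
    (256.24023, "C16H32O2"),
    (342.11621, "C12H22O11"),
]
MASSES = [mass for mass, _ in DATABASE]


def predict_formula(observed_mass, ppm_error):
    """Return (formula, delta) for the closest database mass, or None.

    Assumption: the tolerance is taken relative to the observed mass.
    """
    tolerance = observed_mass * ppm_error / 1e6
    # Binary search for the slice of database masses inside the window.
    lo = bisect.bisect_left(MASSES, observed_mass - tolerance)
    hi = bisect.bisect_right(MASSES, observed_mass + tolerance)
    candidates = [
        (abs(mass - observed_mass), formula) for mass, formula in DATABASE[lo:hi]
    ]
    if not candidates:
        return None  # a feature without predictions would be discarded
    delta, formula = min(candidates)  # lowest delta wins
    return formula, delta


def vk_ratios(formula):
    """Return (O:C, H:C) for a simple formula string such as 'C6H12O6'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    carbon = counts["C"]
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon
```

With this toy database, `predict_formula(180.0635, 10)` selects `C6H12O6`, and `vk_ratios("C6H12O6")` gives the point that would be plotted on the van Krevelen diagram. A wider `ppm_error` enlarges the window and so the number of candidates, which matches the README's caution about low resolution data.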