<tool id="VKMZ" name="VKMZ" version="1.2.0">

  <description>metabolomics formula prediction and van Krevelen diagram generation</description>

  <requirements>
    <requirement type="package" version="2.7">python</requirement>
  </requirements>

  <stdio>
    <exit_code range="1:" level="fatal" />
  </stdio>

  <command detect_errors="aggressive"><![CDATA[
    python '$__tool_directory__/vkmz.py'
    #if str( $mode.mode_selector ) == "xcms":
      xcms
      --data-matrix '$mode.datamatrix'
      --sample-metadata '$mode.samplemetadata'
      --variable-metadata '$mode.variablemetadata'
    #elif str( $mode.mode_selector ) == "tsv":
      tsv
      --input '$mode.input'
    #end if
    #if $advanced_input.polarity=="negative":
      --polarity negative
    #elif $advanced_input.polarity=="positive":
      --polarity positive
    #end if
    #if $advanced_input.no_adjustment=="True":
      --no-adjustment
    #end if
    --output vkmz
    --error $prediction.error
    --database '$prediction.database_type.database'
    #if $prediction.unique=="True":
      --unique
    #end if
    --directory '$__tool_directory__/'
  ]]></command>

  <inputs>
    <conditional name="mode">
      <param name="mode_selector" type="select" label="Input Data">
        <option value="xcms">xcms</option>
        <option value="tsv">tsv</option>
      </param>
      <when value="xcms">
        <param name="datamatrix" label="XCMS Data Matrix" type="data" format="tabular" help="Select XCMS data matrix" />
        <param name="samplemetadata" label="XCMS Sample Metadata" type="data" format="tabular" help="Select XCMS sample metadata" />
        <param name="variablemetadata" label="XCMS Variable Metadata" type="data" format="tabular" help="Select XCMS variable metadata" />
      </when>
      <when value="tsv">
        <param name="input" label="Tabular input" type="data" format="tabular" help="Select tabular data" />
      </when>
    </conditional>
    <section name="advanced_input" title="Advanced Input Options" expanded="false">
      <param name="polarity" label="Override polarity" type="select" help="Force the polarity of all samples">
        <option value="NA">No override</option>
        <option value="negative">Negative</option>
        <option value="positive">Positive</option>
      </param>
      <param name="no_adjustment" label="Disable polarity-based mass adjustment" type="boolean" truevalue="True" help="Use this option if data contains neutral masses." />
    </section>
    <section name="prediction" title="Prediction Options" expanded="true">
      <param name="error" label="Mass Error (PPM)" type="float" value="10.0" min="0" help="Set according to expected mass error in parts-per-million" />
      <conditional name="database_type">
        <param name="database_type_selector" type="select" label="Database Type">
          <option value="heuristic">Heuristically Generated</option>
          <option value="custom">Custom</option>
        </param>
        <when value="heuristic">
          <param name="database" label="Database" type="select" help="Select heuristically generated database">
            <option value="databases/bmrb-light.tsv">Monoisotopic</option>
            <option value="databases/bmrb-heavy_carbon.tsv">C13 Labeled</option>
            <option value="databases/bmrb-heavy_nitrogen.tsv">N15 Labeled</option>
            <option value="databases/bmrb-heavy.tsv">C13 and N15 Labeled</option>
          </param>
        </when>
        <when value="custom">
          <param name="database" label="Database" type="data" format="tabular" help="Select a custom tabular database" />
        </when>
      </conditional>
      <param name="unique" label="Unique matches" type="boolean" truevalue="True" help="Only output features with a single prediction" />
    </section>
  </inputs>
  <outputs>
    <data format="tabular" name="output" from_work_dir="vkmz.tsv" label="${tool.name}_${mode.mode_selector}_tabular" />
    <data format="html" name="output_html" from_work_dir="vkmz.html" label="${tool.name}_${mode.mode_selector}_html" />
  </outputs>

  <tests>
    <test>
      <conditional name="mode">
        <param name="mode_selector" value="xcms" />
        <param name="datamatrix" value="datamatrix.tabular" />
        <param name="samplemetadata" value="sampleMetadata.tabular" />
        <param name="variablemetadata" value="variableMetadata.tabular" />
      </conditional>
      <param name="error" value="10" />
      <param name="database" value="databases/bmrb-light.tsv" />
      <output name="output">
        <assert_contents>
          <has_text text="0.00016357" />
        </assert_contents>
      </output>
    </test>
    <test>
      <conditional name="mode">
        <param name="mode_selector" value="tsv" />
        <param name="input" value="tabular.tabular" />
      </conditional>
      <param name="error" value="10" />
      <conditional name="database_type">
        <param name="database" value="databases/bmrb-light.tsv" />
      </conditional>
      <output name="output">
        <assert_contents>
          <has_text text="0.00016357" />
        </assert_contents>
      </output>
    </test>

  </tests>

  <help><![CDATA[
==========
VKMZ 1.2.0
==========

VKMZ is a metabolomics prediction and visualization tool that creates van Krevelen diagrams from mass spectrometry data. A van Krevelen diagram (VKD) is a scatterplot that places each molecule by its oxygen to carbon ratio (O:C) against its hydrogen to carbon ratio (H:C). Classes of metabolites cluster together on a VKD [0]. Plotting a complex mixture of metabolites on a VKD gives a concise overview of untargeted metabolomics data.
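
As an illustration of the two ratios, the sketch below computes a VKD coordinate from a molecular formula. It is not VKMZ's own code: the helper names ``element_counts`` and ``van_krevelen_point`` are hypothetical, and the parser assumes simple formulas without brackets, charges, or isotope labels.

```python
import re


def element_counts(formula):
    """Count atoms in a simple formula such as 'C6H12O6'.

    Assumption: no brackets, charges, or isotope labels.
    """
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_point(formula):
    """Return (O:C, H:C) for a formula, the x and y of a default VKD."""
    counts = element_counts(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError("formula has no carbon, so the ratios are undefined")
    oxygen_to_carbon = float(counts.get("O", 0)) / carbon
    hydrogen_to_carbon = float(counts.get("H", 0)) / carbon
    return oxygen_to_carbon, hydrogen_to_carbon


# Glucose has one oxygen and two hydrogens per carbon.
glucose_point = van_krevelen_point("C6H12O6")
```

Formulas without carbon raise an error here because both ratios divide by the carbon count.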

=============
Documentation
=============

**Input Data**

VKMZ is designed to use XCMS [1] or tabular data as input.

*XCMS* mode requires three files which XCMS generates: the data matrix, sample metadata, and variable metadata files.

*Tabular* mode requires a tab-delimited file whose first five columns are: sample_id, polarity, mz, retention_time, and intensity.
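
To make the column order concrete, here is a minimal sketch that splits one tab-delimited row into the five leading columns. The function name ``parse_feature_line`` and the example values are hypothetical, and converting mz, retention_time, and intensity to floats is an assumption; whether VKMZ expects a header row is not stated here, so the sketch handles a single data row only.

```python
def parse_feature_line(line):
    """Split one tab-delimited row into the five leading columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least five tab-delimited columns")
    sample_id, polarity, mz, retention_time, intensity = fields[:5]
    return {
        "sample_id": sample_id,
        "polarity": polarity,
        # Assumption: the three measurement columns are numeric.
        "mz": float(mz),
        "retention_time": float(retention_time),
        "intensity": float(intensity),
    }


# Made-up example row, not taken from the VKMZ test data.
row = parse_feature_line("sample_1\tpositive\t181.070665\t43.2\t15000\n")
```

Any columns after the fifth are ignored by this sketch.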

**Advanced Input Options**

*Override polarity* allows users to set the polarity of all features to either *Positive* or *Negative*. Set this if input does not contain the correct polarity information. This option should not be used if data contains both positive and negative polarity.

*Disable polarity-based mass adjustment* prevents VKMZ from converting charged masses to neutral masses. Use this option if the data already contains neutral masses.
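
The following sketch shows one common way such an adjustment can work. It assumes singly charged ions formed by gaining a proton (positive mode) or losing one (negative mode); VKMZ's actual adjustment may differ. The name ``neutral_mass`` and its ``adjust`` flag are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def neutral_mass(mz, polarity, adjust=True):
    """Estimate the uncharged mass of a singly charged feature.

    Assumption: positive ions are [M+H]+ and negative ions are [M-H]-.
    """
    if not adjust:
        # Mirrors disabling the adjustment: the mass is used as given.
        return mz
    if polarity == "positive":
        return mz - PROTON_MASS
    if polarity == "negative":
        return mz + PROTON_MASS
    raise ValueError("polarity must be 'positive' or 'negative'")


# A made-up positive-mode feature at m/z 181.070665.
adjusted = neutral_mass(181.070665, "positive")
unadjusted = neutral_mass(181.070665, "positive", adjust=False)
```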

**Prediction Options**

For each feature, VKMZ attempts to predict a molecular formula by comparing the feature's uncharged mass to a database of known formula masses. A prediction is made when a known mass lies within the specified mass error of the observed, uncharged mass, and VKMZ collects every such prediction for the feature. The prediction with the lowest delta (absolute difference between observed and known mass) is plotted. Features without predictions are discarded. Low-resolution data may yield too many predictions per feature to be useful, especially for high-mass metabolites.
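
The matching step can be sketched as follows. This is an illustration rather than VKMZ's implementation: the function name ``predict``, the in-memory database layout, and the choice to compute the parts-per-million window relative to the observed mass are all assumptions, and the three database entries are example values.

```python
def predict(observed_mass, database, ppm_error):
    """Return (delta, known_mass, formula) tuples within the ppm window.

    Assumption: the window is computed relative to the observed mass.
    Results are sorted so the lowest delta comes first.
    """
    window = observed_mass * ppm_error / 1e6
    matches = []
    for known_mass, formula in database:
        delta = abs(observed_mass - known_mass)
        if delta <= window:
            matches.append((delta, known_mass, formula))
    matches.sort()
    return matches


# Example database of (monoisotopic mass, formula) pairs.
example_database = [
    (180.042259, "C9H8O4"),
    (180.063388, "C6H12O6"),
    (194.080376, "C8H10N4O2"),
]

matches = predict(180.0640, example_database, ppm_error=10.0)
best = matches[0] if matches else None  # the prediction that would be plotted
is_unique = len(matches) == 1  # the condition a unique-match filter would keep
```

At 10 PPM the window around 180.0640 is about 0.0018, so only one of the three example entries matches. Widening the error to 200 PPM brings in a second candidate, which is the situation the *Unique matches* option filters out.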

*Mass error* sets the mass error in parts per million. The appropriate value depends on your mass spectrometer, its calibration, and your methods. It can be estimated by running similar methods on targeted standards that span a range of masses.

*Database* can be set to the provided heuristically generated databases for unlabeled and labeled molecules [2] or to a custom database.

*Unique matches* removes features from the output which have multiple predictions. High mass molecules are affected by this filter more than low mass molecules. This is due to the increased number of possible elemental combinations at a higher mass.

**Tabular Output**

Tabular output contains the columns: sample_id, polarity, mz, rt (retention time), intensity, predictions (a list whose elements each contain the predicted mz, predicted formula, and predicted delta), hc (hydrogen to carbon ratio), oc (oxygen to carbon ratio), and nc (nitrogen to carbon ratio).

**HTML Output**

The HTML web page is an interactive van Krevelen diagram for exploring data.

Predicted features are plotted as circle symbols.

*Min Size* and *Max Size* set the minimum and maximum area of symbols (absolute scaling).

The *Sizer* dropdown sets the algorithm to size each symbol.

  * *Uniform* sets the symbols of all features to the *Max Size*.
  * *Relative Intensity* sets the symbol size of each feature to the feature's intensity divided by the maximum intensity in the dataset, multiplied by the *Max Size*.
  * *Relative Log Intensity* sets the symbol size of each feature to the feature's log intensity divided by the maximum log intensity in the dataset, multiplied by the *Max Size*.
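
The three sizers can be written out as a short sketch. The function name ``symbol_size`` and the sizer keys are hypothetical, and how *Min Size* is applied is not modeled here. The log base does not matter for the third option, because a ratio of two logarithms is the same in any base.

```python
import math


def symbol_size(intensity, max_intensity, max_size, sizer="uniform"):
    """Return a symbol size using one of the three sizing rules."""
    if sizer == "uniform":
        return float(max_size)
    if sizer == "relative_intensity":
        return float(intensity) / max_intensity * max_size
    if sizer == "relative_log_intensity":
        # Assumption: intensities are greater than 1, so both logs are positive.
        return math.log(intensity) / math.log(max_intensity) * max_size
    raise ValueError("unknown sizer: %s" % sizer)


# A feature at half of the maximum intensity, with a Max Size of 30.
uniform = symbol_size(50.0, 100.0, 30.0, "uniform")
relative = symbol_size(50.0, 100.0, 30.0, "relative_intensity")
relative_log = symbol_size(50.0, 100.0, 30.0, "relative_log_intensity")
```

Log scaling compresses the range of sizes, so low-intensity features stay visible next to high-intensity ones.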

*Threshold* is a slider that removes low-intensity features. The slider scales exponentially.

  * Setting the slider to 50% removes features with intensities lower than 25% of the maximum intensity.
  * Setting the slider to 75% removes features with intensities lower than 50% of the maximum intensity.

*x-axis* and *y-axis* set each axis to an alternative elemental ratio.

*Opacity* sets the opacity of feature symbols.

*Visible Samples* sets the visibility of features from given sample IDs. Checked samples are visible. By default all samples are visible.

  ]]></help>

  <citations>
    <citation type="doi">10.1007/s11306-018-1343-y</citation>
    <citation type="doi">10.1021/ac051437y</citation>
    <citation type="doi">10.1021/ac070346t</citation>
  </citations>

</tool>
author	eslerm
date	Tue, 10 Jul 2018 17:58:35 -0400