view w4mjoinpn.xml @ 0:948bac693947 draft

planemo upload for repository https://github.com/HegemanLab/w4mjoinpn_galaxy_wrapper/tree/master commit cedf2e01903099ef5f1bbe624afe4c2845d6bf23
author eschen42
date Sun, 29 Oct 2017 10:05:05 -0400
parents
children dcfaffec48c8
line wrap: on
line source

<tool id="w4mjoinpn" name="Join +/- Ions" version="0.98.1">
  <description>Join positive and negative ionization-mode W4M datasets for the same samples</description>
  <requirements>
     <requirement type="package" version="8.25">coreutils</requirement>
     <requirement type="package" version="4.2.3.dev0">sed</requirement>
  </requirements>
  <stdio>
      <exit_code range="1:" level="fatal" />
  </stdio>
  <command><![CDATA[
      echo "These are the the paths to the tools used by this script:" 1>&2 ;
      which cut sed head paste cat cp bash test 1>&2 ;
      $__tool_directory__/w4mjoinpn.sh
        dmneg $dmneg 
        dmpos $dmpos 
        dmout $dmout 
        smneg $smneg 
        smpos $smpos 
        smout $smout 
        vmneg $vmneg 
        vmpos $vmpos 
        vmout $vmout 
  ]]></command>
  <inputs>
    <param name="dmpos" label="Data matrix positive" type="data" format="tabular" help="Positive ionization-mode: Features x samples (tabular data - decimal: '.'; missing: NA; mode: numerical; separator: tab character)" />
    <param name="smpos" label="Sample metadata positive" type="data" format="tabular" help="Positive ionization-mode: Samples x metadata (tabular data - decimal: '.'; missing: NA; mode: character or numerical; separator: tab character)" />
    <param name="vmpos" label="Variable metadata positive" type="data" format="tabular" help="Positive ionization-mode: Features x metadata (tabular data - decimal: '.'; missing: NA; mode: character or numerical; separator: tab character)" />
    <param name="dmneg" label="Data matrix negative" type="data" format="tabular" help="Negative ionization-mode: Features x samples (tabular data - decimal: '.'; missing: NA; mode: numerical; separator: tab character)" />
    <param name="smneg" label="Sample metadata negative" type="data" format="tabular" help="Negative ionization-mode: Samples x metadata (tabular data - decimal: '.'; missing: NA; mode: character or numerical; separator: tab character)" />
    <param name="vmneg" label="Variable metadata negative" type="data" format="tabular" help="Negative ionization-mode: Features x metadata (tabular data - decimal: '.'; missing: NA; mode: character or numerical; separator: tab character)" />
  </inputs>
  <outputs>
    <data name="dmout" label="${dmneg.name}.posneg" format="tabular" ></data>
    <data name="smout" label="${smneg.name}.posneg" format="tabular" ></data>
    <data name="vmout" label="${vmneg.name}.posneg" format="tabular" ></data>
  </outputs>
  <help><![CDATA[
**Join positive and negative ionization-mode W4M datasets for the same samples**
--------------------------------------------------------------------------------

**Author** - Arthur Eschenlauer (University of Minnesota, esch0041@umn.edu)


Motivation
----------

Workflow4Metabolomics (W4M, Giacomoni *et al.*, 2014; http://workflow4metabolomics.org; https://github.com/workflow4metabolomics) 
provides a suite of Galaxy tools for processing and analyzing metabolomics data.
  
W4M uses the XCMS package (Smith *et al.*, 2006) to extract features and align 
their retention times among multiple samples. 

After peak extraction and alignment, W4M uses the CAMERA package (Kuhl *et al.*, 2012) 
"to postprocess XCMS feature lists and to collect all features related to a compound into a compound spectrum."

Both of these steps are done using data collected in a single ionization mode (i.e., only negative or only positive)
because it would not make sense to attempt to use CAMERA otherwise.

However, performing and interpreting statistical analysis would be more convenient and statistically powerful 
with all variables (features), negative and positive, combined for one analysis.


Description
-----------

This tool joins two sets of MS1 datasets for **exactly** the same set of samples, where one was gathered in positive ionization
mode and the other in negative ionization-mode, for reasons set forth above.  These datasets must be post-XCMS and post-CAMERA.

This tool will fail:

* when the samples are not listed in exactly the same order in the negative-mode dataMatrix and the positive-mode dataMatrix
* when the samples are not listed in exactly the same order in the negative-mode sampleMetadata and the positive-mode sampleMetadata

Otherwise:

* the two dataMatrix files are concatenated, and the names of features identified from positive ionization-mode data are prefixed with "P"; negative, with "N".
* the two variableMetadata files are concatenated, and the names of features are prefixed in the same way.
* if sampleMetadata has a polarity column, its value is set to "posneg" in the output.  (In fact, the sampleMetadata file in the output is copied from the negative ionization-mode sampleMetadata, with the polarity replaced.)

Workflow Position
-----------------

* Upstream tool category: Preprocessing
* Downstream tool categories: Normalisation, Statistical Analysis, Quality Control, Filter and Sort

Working example
---------------

**Input files**

  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Input File                                  | Download from URL                                                                                                     |
  +=============================================+=======================================================================================================================+
  | Data matrix, negative ionization-mode       | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_dataMatrix_neg.tsv       |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Sample metadata, negative ionization-mode   | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_sampleMetadata_neg.tsv   |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Variable metadata, negative ionization-mode | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_variableMetadata_neg.tsv |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Data matrix, positive ionization-mode       | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_dataMatrix_pos.tsv       |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Sample metadata, positive ionization-mode   | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_sampleMetadata_pos.tsv   |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Variable metadata, positive ionization-mode | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/input_variableMetadata_pos.tsv |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+

**Output files**

  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Output File                                 | Download from URL                                                                                                     |
  +=============================================+=======================================================================================================================+
  | Data matrix                                 | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/output_dataMatrix.tsv          |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Sample metadata                             | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/output_sampleMetadata.tsv      |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+
  | Variable metadata                           | https://raw.githubusercontent.com/HegemanLab/w4mjoinpn_galaxy_wrapper/master/test-data/output_variableMetadata.tsv    |
  +---------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+

  ]]></help>
  <citations>
    <citation type="doi">10.5281/zenodo.1038289</citation>
    <!--
    <citation type="bibtex">
      @misc{
        w4mjoinpn_galaxy_wrapper,
        author = {Eschenlauer, Arthur},
        year = {2017},
        title = {w4mjoinpn_galaxy_wrapper},
        publisher = {GitHub},
        journal = {GitHub repository},
        url = {https://github.com/HegemanLab/w4mjoinpn_galaxy_wrapper},
        doi = 10.5281/zenodo.1038290
      }
    </citation>
    -->
    <!-- Giacomoni, 2014 Workflow4Metabolomics: a collaborative research infrastructure for computational metabolomics -->
    <citation type="doi">10.1093/bioinformatics/btu813</citation>
    <!-- Kuhl et al., 2012 -->
    <citation type="doi">10.1021/ac202450g</citation>
    <!-- Smith, 2006 XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification. -->
    <citation type="doi">10.1021/ac051437y</citation>
  </citations>
</tool>