view abims_xcms_fillPeaks.xml @ 3:5a54397dbc97 draft

planemo upload
author lecorguille
date Mon, 22 Feb 2016 16:40:19 -0500
parents 5815e6e11f9b
children 2edfa5e1f719
line wrap: on
line source

<tool id="abims_xcms_fillPeaks" name="xcms.fillPeaks" version="2.0.5">

    <description>Integrate the signal in the region of that peak group not represented and create a new peak</description>

    <requirements>
        <requirement type="package" version="3.1.2">R</requirement>
        <requirement type="binary">Rscript</requirement>
        <requirement type="package" version="1.44.0">xcms</requirement>
        <requirement type="package" version="2.2.0">xcms_w4m_script</requirement>
    </requirements>
    
    <stdio>
        <exit_code range="1:" level="fatal" />
    </stdio>

    <command><![CDATA[
        xcms.r 
        xfunction fillPeaks 
        image $image 

        xsetRdataOutput $xsetRData

        method $method

        ###if $zip_file:
        ##    zipfile $zip_file
        ###end if
        ;
        return=\$?;
        mv log.txt $log;
        cat $log;
        sh -c "exit \$return"
        
    ]]></command>

    <inputs>
        <param name="image" type="data" format="rdata.xcms.group,rdata" label="xset RData file" help="output file from another function xcms (group)" />
        <param name="method" type="select" label="Filling method" help="[method] See the help section below">
            <option value="chrom" selected="true">chrom</option>
            <option value="MSW" >MSW</option>
        </param>
        <!-- To pass planemo test -->
        <!--<param name="zip_file" type="hidden_data" format="no_unzip.zip" label="Zip file" />-->
    </inputs>

    <outputs>
        <data name="xsetRData" format="rdata.xcms.fillpeaks" label="${image.name[:-6]}.fillPeaks.RData" />
        <data name="log" format="txt" label="xset.log.txt"  hidden="true" />
    </outputs>
    
    <tests>
        <test>
            <param name="image" value="xset.group.retcor.group.RData"/>
            <param name="method" value="chrom"/>
            <param name="zip_file" value="sacuri.zip"/>
            <!--<output name="xsetRData" file="xset.group.retcor.group.fillPeaks.RData" />-->
            <output name="log">
                <assert_contents>
                    <has_text text="object with 9 samples" />
                    <has_text text="Time range: 0.7-1139.9 seconds (0-19 minutes)" />
                    <has_text text="Mass range: 50.0019-999.9863 m/z" />
                    <has_text text="Peaks: 157780 (about 17531 per sample)" />
                    <has_text text="Peak Groups: 6761" />
                    <has_text text="Sample classes: bio, blank" />
                </assert_contents>
            </output>
        </test>
    </tests>

    <help><![CDATA[

        
.. class:: infomark

**Authors**  Colin A. Smith csmith@scripps.edu, Ralf Tautenhahn rtautenh@gmail.com, Steffen Neumann sneumann@ipb-halle.de, Paul Benton hpaul.benton08@imperial.ac.uk and Christopher Conley cjconley@ucdavis.edu 

.. class:: infomark

**Galaxy integration** ABiMS TEAM - UPMC/CNRS - Station biologique de Roscoff and Yann Guitton yann.guitton@univ-nantes.fr - part of Workflow4Metabolomics.org [W4M]

 | Contact support@workflow4metabolomics.org for any questions or concerns about the Galaxy implementation of this tool.


---------------------------------------------------

==============
Xcms.fillPeaks
==============

-----------
Description
-----------

For each sample, identify peak groups where that sample is not
represented. For each of those peak groups, integrate the signal
in the region of that peak group and create a new peak.

According to the type of raw-data there are 2
different methods available. for filling gcms/lcms data the method
"chrom" integrates raw-data in the chromatographic domain, whereas
"MSW" is used for peaklists without retention-time information
like those from direct-infusion spectra.



-----------------
Workflow position
-----------------


**Upstream tools**

========================= ================= ================== ==========
Name                      output file       format             parameter
========================= ================= ================== ==========
xcms.group                xset.group.RData  rdata.xcms.group   RData file
========================= ================= ================== ==========


**Downstream tools**

+---------------------------+------------------+-----------------------+
| Name                      | Output file      | Format                |
+===========================+==================+=======================+
|xcms.diffreport            | xset.retcor.RData| rdata.xcms.fillpeaks  |
+---------------------------+------------------+-----------------------+
|xcms.annotateDiffreport    | xset.retcor.RData| rdata.xcms.fillpeaks  |
+---------------------------+------------------+-----------------------+

The output file **xset.fillpeaks** is an RData file. You can continue your analysis using it in **xcms.diffreport** or **xcms.annotateDiffreport** tool as an next step of the workflow.


**General schema of the metabolomic workflow**

.. image:: xcms_fillpeaks_workflow.png



-----------
Input files
-----------

+---------------------------+-----------------------+
| Parameter : num + label   |   Format              |
+===========================+=======================+
| 1 : RData file            |   rdata.xcms.group    |
+---------------------------+-----------------------+


----------
Parameters
----------


Method
------

**chrom**

    | This method produces intensity values for those missing samples by integrating raw data in peak group region. In a given group, the start and ending retention time points for integration are defined by the median start and end points of the other detected peaks. The start and end m/z values are similarly determined. Intensities can be still be zero, which is a rather unusual intensity for a peak.  This is the case if e.g. the raw data was threshholded, and the integration area contains no actual raw intensities, or if one sample is miscalibrated, such the raw data points are (just) outside the integration area.
    | Importantly, if retention time correction data is available, the alignment information is used to more precisely integrate the propper region of the raw data. If the corrected retention time is beyond the end of the raw data, the value will be not-a-number (NaN).

**MSW**

    | "MSW" is used for peaklists without retention-time information like those from direct-infusion spectra.

------------
Output files
------------

xset.fillPeaks.RData : rdata.xcms.fillpeaks format

    | Rdata file that will be used in the **xcms.diffreport** or **xcms.annotateDiffreport** step of the workflow.
    
------

.. class:: infomark 

The output file is an group.RData file. You can continue your analysis using it in **xcms.diffreport** or **xcms.annotateDiffreport** tool.

    
---------------------------------------------------

---------------
Working example
---------------

Input files
-----------

    | RData file -> **xset.retcor.RData**

Parameters
----------

    | method -> **chrom**
    

Output files
------------

    | **xset.fillPeaks.RData: RData file**


---------------------------------------------------

Changelog/News
--------------

**Version 2.0.5 - 10/02/2016**

- BUGFIX: better management of errors. Datasets remained green although the process failed

- UPDATE: refactoring of internal management of inputs/outputs

- UPDATE: refactoring to feed the new report tool


**Version 2.0.2 - 02/06/2015**

- IMPROVEMENT: new datatype/dataset formats (rdata.xcms.raw, rdata.xcms.group, rdata.xcms.retcor ...) will facilitate the sequence of tools and so avoid incompatibility errors.

- IMPROVEMENT: parameter labels have changed to facilitate their reading.


    ]]></help>

    <citations>
        <citation type="doi">10.1021/ac051437y</citation>
        <citation type="doi">10.1093/bioinformatics/btu813</citation>
    </citations>


</tool>