Mercurial > repos > malex > secimtools

<tool id="subsetData" name="Subset of Data" version="1.0,0">
    <description>Subset your data base on groups or features.</description>
    <requirements>
        <requirement type="python-module">os</requirement>
        <requirement type="python-module">pandas</requirement>
        <requirement type="python-module">interface</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        subset_data.py
        --input $input
        --design $design
        --uniqID $uniqID
        #if $group
            --group $group
        #end if
        --drops $toDrop
        --out $out
    ]]></command>
    <inputs>
        <param name="input" type="data" format="tabular" label="Wide Dataset" help="Input dataset in wide format and tab separated. If not tab separated see TIP below."/>
        <param name="design" type="data" format="tabular" label="Design File" help="Design file tab separated. Note you need a 'sampleID' column. If not tab separated see TIP below."/>
        <param name="uniqID" type="text" size="30" value="" label="Unique Feature ID" help="Name of the column in your Wide Dataset that has unique Feature IDs."/>
        <param name="group" type="text" size="30" value="" optional="false" label="Group/Treatment [Optional]" help="Name of the column in your Design File that contains group classifications. If none provided the drop will be performed by 'sampleID'."/>
        <param name="toDrop" type="text" size="30" optional="false" label="Group(s)/Sample(s) to drop" help="Name of the Group(s)/Sample(s), comma separeted, that will be removed from your wide datset."/>
    </inputs>
    <outputs>
        <data format="tabular" name="out" label="${tool.name} on ${on_string}: Value"/>
    </outputs>
    <macros>
            <import>macros.xml</import>
    </macros>
    <tests>
     <test>
        <param name="input"  value="ST000006_data.tsv"/>
        <param name="design" value="ST000006_design_names_underscore.tsv"/>
        <param name="uniqID" value="Retention_Index" />
        <param name="group"  value="White_wine_type_and_source" />
        <param name="drops"  value="Chardonnay_ Napa_ CA 2003,Riesling_ CA 2004" />
        <output name="out"   file="ST000006_subset_data_output.tsv" />
     </test>
    </tests>
<help>

@TIP_AND_WARNING@

**Tool Description**

The tool creates new wide format dataset and desing dataset based on the existing wide and desing datasets where only groups specified by the user are present.  User has to choose which group(s) to include in the new datasets.

--------------------------------------------------------------------------------

**Input**

    - Two input datasets are required.

@WIDE@

**NOTE:** The sample IDs must match the sample IDs in the Design File
(below). Extra columns will automatically be ignored.

@METADATA@

@UNIQID@

**Group/Treatment [Optional]**

    - Name of the column in your Design File that contains group classifications. If none provided the drop will be performed by 'sampleID'.

**Group(s)/Sample(s) to drop**

    - Name of the Group(s)/Sample(s), comma separeted, that will be removed from your wide datset.

--------------------------------------------------------------------------------

**Output**

This tool will output two TSV files: a TSV file containing the subset of the original wide format dataset and a TSV file containing the subset of the original design dataset.  Both datasets will contain only the samples belonging to groups selected by the user.


</help>
</tool>
author	malex
date	Mon, 08 Mar 2021 20:55:03 +0000
parents
children