Mercurial > repos > antmarge > dataoverview

<tool id="dataOverview" name="Data Overview" version="0.1.0">

    <!-- Margaret Antonio  17.01.08 -->

    <description>summarize Tn-Seq libraries and genome</description>

    <requirements>
        <requirement type="package" version="5.18.1">perl</requirement>
        <requirement type="package" version="2.45">perl_getopt_long</requirement>
        <requirement type="package" version="1.6.922">bioperl</requirement>
    </requirements>

    <command interpreter="perl">
        dataOverview.pl
        -f $fastaFile
        -r $ref
        -w $weight_ceiling
        -c $cutoff
        -o $outfile
        $input
        #for $a in $additionalcsv
        ${a.input2}
        #end for
    </command>

    <inputs>
        <param name="input" type="data" label="csv fitness file"/>
            <repeat name="additionalcsv" title="Additional csv fitness file(s)">
                <param name="input2" type="data" label="Select" />
            </repeat>
        <param format="fasta" name="fastaFile" type="data" label="Fasta file"/>
        <param name="ref" type="data" label="GenBank reference genome"/>
        <param name="cutoff" type="integer" value="10" label="Cutoff"/>
        <param name="weight_ceiling" type="integer" value="50" label="Weight ceiling"/>


    </inputs>

    <outputs>
        <data format="txt" name="outfile" />
    </outputs>

    <help>
        **What it does**

        This tool summarizes the Tn-Seq single insertion libraries and the organism's genome.

        **The options explained**

        The csv fitness file(s): These are the csv (comma separated values) files containing the fitness values that will be used in downstream analyses. Since they should have been produced by the "Calculate Fitness" tool, each line besides the header should represent the following information for an insertion location: position,strand,count_1,count_2,ratio,mt_freq_t1,mt_freq_t2,pop_freq_t1,pop_freq_t2,gene,D,W,nW

        GenBank reference genome: the reference genome of whatever model you're working with, which needs to be in standard genbank format. For more on that format see the genbank website.

        Weight ceiling: This value lets you set a weight ceiling for the weights of fitness values. It's only relevant if you're using weighted algorithms.

        Cutoff: This value lets you ignore the fitness scores of any insertion locations with an average count (the number of counts from t1 and t2 divided by 2) less than it.

        The name of your output file: self-explanatory. Remember to have it end in ".csv".


    </help>

</tool>
author	antmarge
date	Tue, 28 Mar 2017 21:56:19 -0400
parents
children