view fasta_formatter.xml @ 2:9457a20156db draft

planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
author devteam
date Mon, 12 Oct 2015 10:15:31 -0400
parents 1dbb5181c327
children 859422bcb689

<tool id="cshl_fasta_formatter" version="1.0.0" name="FASTA Width">
    <description>formatter</description>
    <requirements>
        <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
    </requirements>
    <!--
        Note:
            fasta_formatter also has a tabular output mode (-t),
            but Galaxy already contains such a tool, so no need
            to offer the user a duplicated tool.

            So this XML tool only changes the width (line-wrapping) of a
            FASTA file.
    -->
    <command>
<![CDATA[
zcat -f < '$input' | fasta_formatter -w $width -o '$output'
]]>
    </command>
    <inputs>
        <param format="fasta" name="input" type="data" label="Library to re-format" />

        <param name="width" type="integer" value="0" label="New width for nucleotide strings" help="Use 0 to write each sequence on a single line." />
    </inputs>
    <outputs>
        <data format="fasta" name="output" metadata_source="input" />
    </outputs>
    <tests>
        <test>
            <!-- Re-format a FASTA file into a single line -->
            <param name="input" value="fasta_formatter1.fasta" />
            <param name="width" value="0" />
            <output name="output" file="fasta_formatter1.out" />
        </test>
        <test>
            <!-- Re-format a FASTA file into multiple lines, wrapping at 60 characters -->
            <param name="input" value="fasta_formatter1.fasta" />
            <param name="width" value="60" />
            <output name="output" file="fasta_formatter2.out" />
        </test>
    </tests>
    <help>
**What it does**

This tool re-formats a FASTA file by changing the width (line wrapping) of the nucleotide lines.

**TIP:** Writing each sequence on a single line (**width = 0**) can be useful for scripting (with **grep**, **awk**, and **perl**): every odd line is then a sequence identifier, and every even line is the corresponding nucleotide sequence.
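
As a rough illustration of that odd/even layout, the following Python sketch pairs each identifier line with the sequence line that follows it. The function name ``read_single_line_fasta`` is hypothetical and not part of this tool or of the FASTX-toolkit; the sketch assumes the input was already written with **width = 0** and contains no blank lines.

```python
def read_single_line_fasta(lines):
    """Yield (identifier, sequence) pairs from single-line FASTA text.

    Assumes strictly alternating lines: an identifier line starting
    with '>' followed by exactly one sequence line.
    """
    iterator = iter(lines)
    for header in iterator:
        header = header.rstrip("\n")
        if not header.startswith(">"):
            raise ValueError("expected an identifier line, got: " + header)
        # The sequence is the very next line; an absent line yields "".
        sequence = next(iterator, "").rstrip("\n")
        yield header[1:], sequence
```

Passing an open file handle (for example ``read_single_line_fasta(open("reads.fasta"))``, where the file name is only a placeholder) would yield one pair per record.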

--------

**Example**

Input FASTA file (each nucleotide line is at most 50 characters long)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTC
    CCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTG
    TTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACA
    ATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
    TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
    aactggtctttacctTTAAGTTG


Output FASTA file (with width=80)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTT
    ATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCA
    ATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTAC
    GTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

Output FASTA file (with width=0, i.e. each sequence on a single line)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG
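
The re-wrapping shown in the examples above can be sketched in a few lines of Python. This is only an illustration of the operation, not the FASTX-toolkit implementation; the function name ``wrap_fasta`` is hypothetical, and the sketch assumes a non-negative width and skips blank lines.

```python
def wrap_fasta(lines, width):
    """Yield FASTA lines with each sequence re-wrapped to `width` characters.

    A width of 0 writes each sequence on a single line.
    """
    def flush(chunks):
        # Join the collected pieces of one sequence, then re-split them.
        sequence = "".join(chunks)
        if not sequence:
            return
        if width == 0:
            yield sequence
        else:
            for start in range(0, len(sequence), width):
                yield sequence[start:start + width]

    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new identifier ends the previous record.
            yield from flush(chunks)
            chunks = []
            yield line
        elif line:
            chunks.append(line)
    # Emit the last record, which has no following identifier line.
    yield from flush(chunks)
```

Identifier lines pass through unchanged, and the case of the nucleotides is preserved, as in the example output above.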

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
    </help>
</tool>