view fasta_formatter.xml @ 1:1dbb5181c327

Removed excess version attributes.
author Dave Bouvier <dave@bx.psu.edu>
date Tue, 03 Dec 2013 12:36:12 -0500
parents 8f0ae92440b8
children 9457a20156db

<tool id="cshl_fasta_formatter" version="1.0.0" name="FASTA Width">
	<description>formatter</description>
	<requirements>
		<requirement type="package" version="0.0.13">fastx_toolkit</requirement>
	</requirements>
	<!--
		Note:
			fasta_formatter also has a tabular output mode (-t),
			but Galaxy already contains such a tool, so no need
			to offer the user a duplicated tool.

			So this XML tool only changes the width (line-wrapping) of a
			FASTA file.
	-->
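	<!--
		Illustration only: the note above says this wrapper only changes the
		line width of a FASTA file. The Python sketch below shows what that
		re-wrapping amounts to. It is not taken from fastx_toolkit; the names
		wrap and rewrap_fasta are hypothetical, and it assumes a width of 0
		means "one line per sequence", as in the tool help.

```python
# Hypothetical sketch of FASTA re-wrapping; not fastx_toolkit source code.

def wrap(seq, width):
    """Split one sequence into chunks of `width` characters (0 = no split)."""
    if width == 0:
        return [seq]
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def rewrap_fasta(lines, width):
    """Return FASTA lines with every sequence re-wrapped at `width`."""
    records = []  # list of [header, sequence] pairs
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header line starts a new record.
            records.append([line, ""])
        elif line and records:
            # Any other non-empty line extends the current sequence.
            records[-1][1] += line
    out = []
    for header, seq in records:
        out.append(header)
        out.extend(wrap(seq, width))
    return out
```

		Calling rewrap_fasta(lines, 0) on the lines of a FASTA file would give
		alternating identifier and sequence lines, which is the layout the
		tool help recommends for scripting.
	-->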
	<command>zcat -f '$input' | fasta_formatter -w $width -o '$output'</command>
	<inputs>
		<param format="fasta" name="input" type="data" label="Library to re-format" />

		<param name="width" type="integer" value="0" label="New width for nucleotide strings" help="Use 0 to write each sequence on a single line." />
	</inputs>

	<tests>
		<test>
			<!-- Re-format a FASTA file into a single line -->
			<param name="input" value="fasta_formatter1.fasta" />
			<param name="width" value="0" />
			<output name="output" file="fasta_formatter1.out" />
		</test>
		<test>
			<!-- Re-format a FASTA file into multiple lines wrapping at 60 characters -->
			<param name="input" value="fasta_formatter1.fasta" />
			<param name="width" value="60" />
			<output name="output" file="fasta_formatter2.out" />
		</test>
	</tests>

	<outputs>
		<data format_source="input" name="output" metadata_source="input" />
	</outputs>

<help>
**What it does**

This tool re-formats a FASTA file, changing the width of the nucleotide lines (the number of characters on each sequence line).
  
**TIP:** Writing each sequence on a single line (**width = 0**) can be useful for scripting with **grep**, **awk**, and **perl**: every odd-numbered line is then a sequence identifier, and every even-numbered line is the complete nucleotide sequence for that identifier.

--------

**Example**

Input FASTA file (each nucleotide line is at most 50 characters long)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTC
    CCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTG
    TTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACA
    ATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
    TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
    aactggtctttacctTTAAGTTG


Output FASTA file (with width=80)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTT
    ATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCA
    ATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTAC
    GTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

Output FASTA file (with width=0, each sequence on a single line)::

    >Scaffold3648
    AGGAATGATGACTACAATGATCAACTTAACCTATCTATTTAATTTAGTTCCCTAATGTCAGGGACCTACCTGTTTTTGTTATGTTTGGGTTTTGTTGTTGTTGTTTTTTTAATCTGAAGGTATTGTGCATTATATGACCTGTAATACACAATTAAAGTCAATTTTAATGAACATGTAGTAAAAACT
    >Scaffold9299
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
 
</help>
</tool>