view fasta_remove_id.xml @ 1:d85af06ab3db draft

Uploaded XML
author curtisross
date Thu, 23 Sep 2021 16:25:45 +0000

<?xml version="1.0"?>
<tool id="edu.tamu.cpt.fasta.remove_desc" name="Remove Description" version="19.1.0.0">
	<description>from a FASTA file</description>
	<macros>
		<import>macros.xml</import>
		<import>cpt-macros.xml</import>
	</macros>
	<expand macro="requirements"/>
	<command detect_errors="aggressive"><![CDATA[
'$__tool_directory__/fasta_remove_id.py'
@SEQUENCE@
> '$out'
]]></command>
	<inputs>
		<expand macro="input/fasta" />
	</inputs>
	<outputs>
		<data format="fasta" name="out" />
	</outputs>
	<tests>
		<test>
			<param name="sequences" value="T7_DESC.fasta"/>
			<output name="out" file="T7_CLEAN.fasta" />
		</test>
		<test>
			<param name="sequences" value="regex.a3.fa"/>
			<output name="out" file="regex.a3.clean.fa" />
		</test>
	</tests>
	<help>
**What it does**

Removes the "description" field from every header line of an input FASTA file.
The FASTA ID (the text from the > up to the first space) is kept; the
description (everything after that first space, up to the end of the line) is
discarded.

The removal is permanent: the output does not retain the descriptions. This is
useful for downstream tools that behave unexpectedly when a description is
present, e.g. Glimmer/GeneMarkS.

**Example Input/Output**

For an input FASTA file::

	>1|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 288 bp
	acttacgcggagagatgagaccaacgctcgcctaggggcacgcttgtaattgacttatct
	>2|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 232 bp
	gttggggacccacctatcagggagtgtagtagtataagactgtccaataccccccaacat

Because the ID ends at the first space, the resulting FASTA file keeps only
``>1|random`` and ``>2|random`` as headers, without a description::

	>1|random
	acttacgcggagagatgagaccaacgctcgcctaggggcacgcttgtaattgacttatct
	>2|random
	gttggggacccacctatcagggagtgtagtagtataagactgtccaataccccccaacat
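
**Sketch of the header rule**

The rule described above can be illustrated with a few lines of Python. This is only a minimal sketch under the assumption that the ID ends at the first space character; it is not the contents of ``fasta_remove_id.py``, and the function names ``strip_description`` and ``remove_descriptions`` are hypothetical.

```python
def strip_description(header):
    """Return a FASTA header line reduced to its ID.

    Assumption: the ID is everything from '>' up to the first space.
    """
    return header.rstrip("\r\n").split(" ", 1)[0]


def remove_descriptions(lines):
    """Yield FASTA lines, dropping the description from each header line."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Header line: keep only the ID part.
            yield strip_description(line)
        else:
            # Sequence line: pass through unchanged.
            yield line


if __name__ == "__main__":
    example = [
        ">1|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 288 bp",
        "acttacgcggagagatgagaccaacgctcgcctaggggcacgcttgtaattgacttatct",
    ]
    for out_line in remove_descriptions(example):
        print(out_line)
```

Headers that contain no space are returned unchanged, since they consist of an ID only.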
	</help>
	<expand macro="citations" />
</tool>