view annotate_ends/annotate_ends.xml @ 0:68a3648c7d91 draft default tip

Uploaded
author matteoc
date Thu, 22 Dec 2016 04:45:31 -0500

<tool id="ends_annot" name="Sanger Ends Attacher" version="0.">
 <description>Attach fosmid end names to assembled contigs based on sequence similarity</description>
 <command> /home/inmare/galaxy/tools/annotate_ends/attach.tags.pl $ends $fos $blast $minid $alnl $out $table</command>
 <!-- "approved by the boss" -->
 <inputs>
 	<param name="ends" type="data" format="fasta" label="Multi-FASTA file containing the fosmid end sequences" help="FASTA only"/>
 	<param name="fos" type="data" format="fasta" label="Multi-FASTA file of the assembled fosmids" help="FASTA only"/>
 	<param name="blast" type="data" format="tabular" label="BLAST output" help="12-column tabular output only"/>
 	<param name="minid" type="integer" label="minimum identity" value="95" help="identity cutoff"/>
 	<param name="alnl" type="integer" label="minimum alignment length" value="200" help="minimum alignment length"/>

 </inputs>
 <outputs>
 	<data name="out" format="fasta" label="decorated fosmid file"/>
 	<data name="table" format="tabular" label="conversion table"/>
 </outputs>
 <test/>
 <help>
When Sanger sequencing of the fosmid ends has been performed, assembled fosmids can be assigned to their putative clones by sequence similarity. This tool assists in that process by parsing blastN output files.

For the tool to work properly, the Sanger ends must be provided in a single FASTA file, and each sequence must be named according to the following convention: the fosmid name, followed by "_" and then "F" for forward or "R" for reverse. The fosmid FASTA file must be used as the query, and the search must be performed against a database containing all the contigs derived from the assembly of the fosmids. The tool requires the "standard" (12-column) tabular output from blastN; any other format may produce incorrect results.

The output consists of a new FASTA file in which the fosmid names, as given in the input file, are added as prefixes to the contig names. The prefix Unf (for "unassigned fosmid") is added to contigs that show no significant similarity to any fosmid end.

A minimum alignment length and an identity cutoff must be provided as input. As a rule of thumb, set the alignment length cutoff to about half the length of the Sanger sequences and the identity cutoff above 90%.
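
The naming convention and the cutoffs described above can be illustrated with a short Python sketch. This is not the code of attach.tags.pl, whose logic is not shown here; the function names fosmid_name and assign_contigs are hypothetical. The sketch assumes that the end sequences are the BLAST query (column 1) and the contigs are the subject (column 2), and that columns 3 and 4 hold percent identity and alignment length, as in the standard 12-column tabular layout. The default cutoffs mirror the values in this tool's parameter definitions.

```python
import csv


def fosmid_name(end_id):
    """Return the fosmid name for an end named like 'name_F' or 'name_R'.

    Returns None when the name does not follow the convention.
    """
    name, sep, strand = end_id.rpartition("_")
    if sep and name and strand in ("F", "R"):
        return name
    return None


def assign_contigs(blast_handle, min_identity=95.0, min_length=200):
    """Map each contig to the set of fosmid names whose ends hit it.

    Assumption: query = Sanger end, subject = contig. Hits below either
    cutoff, rows without exactly 12 columns, and query names that do not
    follow the naming convention are skipped.
    """
    assigned = {}
    for row in csv.reader(blast_handle, delimiter="\t"):
        if len(row) != 12:
            continue  # not the standard 12-column layout
        query, contig = row[0], row[1]
        identity, length = float(row[2]), int(row[3])
        name = fosmid_name(query)
        if name is None:
            continue
        if identity >= min_identity and length >= min_length:
            assigned.setdefault(contig, set()).add(name)
    return assigned
```

Contigs that never appear in the returned mapping are the ones that would receive the Unf prefix under this reading.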
 </help>
</tool>