agame_custom_tools: fosm_cluster/clusterF.xml annotate

annotate fosm_cluster/clusterF.xml @ 0:68a3648c7d91 draft default tip

Uploaded

author	matteoc
date	Thu, 22 Dec 2016 04:45:31 -0500

<tool id="cluster" name="FosBin" version="0.">
  <command> /home/inmare/galaxy/tools/fosm_cluster $f1 $l $o1 $o2 </command>
  <description>k-means clustering of assembled fosmids.</description>
  <help>This tool tentatively assigns contigs from incomplete fosmid assemblies to clusters, each ideally corresponding to a single fosmid. The current version is only compatible with SPAdes output, because coverage is recovered from the fasta headers. Future versions might require a different set of input files. Full details are in Chiara et al. #paper id. Contigs are clustered by a custom script based on the R implementation of the k-means algorithm, using 1500 starting positions for the centroids. Clustering uses metrics based on the coverage, GC composition and tetranucleotide composition of each contig, which are computed directly from the fasta file. The user must specify the desired number of clusters; contigs are partitioned accordingly.</help>
  <inputs>
    <param name="f1" type="data" format="fasta" label="fasta file with contigs" help="currently needs to be in SPAdes format"/>
    <param name="l" type="integer" label="number of clusters" value="5" help="should correspond to the number of fosmids"/>
  </inputs>
  <outputs>
    <data name="o1" ftype="tabular" format="txt" label="fosmids to cluster table"/>
    <data name="o2" ftype="fasta" format="fasta" label="modified fasta file, containing cluster identifiers in the header"/>
  </outputs>
  <tests>
    <test>
      <param name="f1" value="sim1_galaxy.fasta"/>
      <param name="l" value="9"/>
      <output name="o1" file="res"/>
      <output name="o2" file="fasta.fas"/>
    </test>
  </tests>
</tool>
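
The help text says that coverage is read from the SPAdes fasta headers and that GC and tetranucleotide composition are computed from the sequences. The following is a minimal Python sketch of how such per-contig features could be derived. It is not the tool's actual script, which is R-based and not shown here. The function names are hypothetical, the header pattern assumes the usual SPAdes naming (`NODE_<n>_length_<len>_cov_<cov>`), and any scaling or weighting the real tool applies before clustering is unknown.

```python
import re
from collections import Counter
from itertools import product

# Assumed SPAdes contig naming, e.g. "NODE_1_length_5000_cov_12.5"
SPADES_HEADER = re.compile(r"NODE_\d+_length_(\d+)_cov_([0-9.]+)")

# All 256 tetranucleotides in a fixed (lexicographic) order
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def parse_spades_header(header):
    """Return (length, coverage) parsed from a SPAdes-style fasta header."""
    match = SPADES_HEADER.search(header)
    if match is None:
        raise ValueError("not a SPAdes-style header: %r" % header)
    return int(match.group(1)), float(match.group(2))


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each tetramer over overlapping windows.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def contig_features(header, seq):
    """Feature vector: [coverage, GC content, 256 tetramer frequencies]."""
    _, coverage = parse_spades_header(header)
    return [coverage, gc_content(seq)] + tetranucleotide_frequencies(seq)
```

Vectors like these, one per contig, could then be passed to any k-means routine with the requested number of clusters. In R, the multiple starting positions mentioned in the help text would typically map to the `nstart` argument of `kmeans`, although how the tool's script actually sets this is an assumption.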