view fosm_cluster/clusterF.xml @ 0:68a3648c7d91 draft default tip
Uploaded
author | matteoc |
---|---|
date | Thu, 22 Dec 2016 04:45:31 -0500 |
parents | |
children |
<tool id="cluster" name="FosBin" version="0.">
  <command> /home/inmare/galaxy/tools/fosm_cluster $f1 $l $o1 $o2 </command>
  <description> k-means clustering of assembled fosmids.</description>
  <help>
The tool was designed to tentatively assign contigs from incomplete fosmid assemblies to clusters, ideally corresponding to single fosmids. Clustering is based on the tetra-nucleotide frequencies and the coverage of the contigs. The current version is only compatible with SPAdes output, as coverage is recovered from the fasta headers. Future versions might require a different set of input files. Full details are in Chiara et al. #paper id.

Clustering of contigs is performed by a custom script based on the R implementation of the K-means algorithm, using 1500 starting positions for the centroids. The clustering is performed on metrics based on coverage, GC composition and tetra-nucleotide composition of each contig, which are computed directly from the fasta file. The user must input the desired number of clusters; contigs are partitioned accordingly.
  </help>
  <inputs>
    <param name="f1" type="data" format="fasta" label="fasta file with contigs" help="currently needs to be in SPAdes format"/>
    <param name="l" type="integer" label="number of clusters" value="5" help="should correspond to the number of fosmids"/>
  </inputs>
  <outputs>
    <data name="o1" ftype="tabular" format="txt" label="fosmids to cluster table"/>
    <data name="o2" ftype="fasta" format="fasta" label="modified fasta file, containing cluster identifiers in the header"/>
  </outputs>
  <tests>
    <test>
      <param name="f1" value="sim1_galaxy.fasta"/>
      <param name="l" value="9"/>
      <output name="o1" file="res"/>
      <output name="o2" file="fasta.fas"/>
    </test>
  </tests>
</tool>
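The help text above describes the per-contig metrics (coverage from the SPAdes header, GC composition, tetra-nucleotide composition) only in words. The following is a minimal Python sketch of how such a feature vector could be computed. It is not the tool's own R script. The function names, the header regular expression, and the choice of relative frequencies over the 256 tetramers are all assumptions made for illustration. The resulting vectors would then be handed to a k-means implementation with the requested number of clusters.

```python
import re
from itertools import product

# All 256 tetranucleotides in a fixed order, so every contig yields
# a vector of the same length (assumed layout, not the tool's own).
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

# SPAdes names contigs like "NODE_1_length_1000_cov_12.5".
SPADES_HEADER = re.compile(r"NODE_\d+_length_(\d+)_cov_([0-9.]+)")


def parse_coverage(header):
    """Return the coverage value embedded in a SPAdes contig header."""
    match = SPADES_HEADER.search(header)
    if match is None:
        raise ValueError(f"not a SPAdes-style header: {header!r}")
    return float(match.group(2))


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetramer_frequencies(seq):
    """Relative frequency of each tetramer over all overlapping windows."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows with N or other ambiguity codes are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in TETRAMERS]


def contig_features(header, seq):
    """Feature vector: [coverage, GC fraction, 256 tetramer frequencies]."""
    return [parse_coverage(header), gc_content(seq)] + tetramer_frequencies(seq)


features = contig_features("NODE_1_length_8_cov_12.5", "ACGTACGT")
print(len(features), features[0], features[1])
```

Coverage, GC fraction and tetramer frequencies live on very different scales, so a real pipeline would likely normalise the columns before clustering. Whether and how the original script does this is not stated in the help text.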