view fosm_cluster/clusterF.xml @ 0:68a3648c7d91 draft default tip

Uploaded
author matteoc
date Thu, 22 Dec 2016 04:45:31 -0500

<tool id="cluster" name="FosBin" version="0.">
 <command> /home/inmare/galaxy/tools/fosm_cluster $f1 $l $o1 $o2 </command>
 <description> k-means clustering of assembled fosmids.</description>
 <help>This tool tentatively assigns contigs from incomplete fosmid assemblies to clusters, each ideally corresponding to a single fosmid. Clustering is based on the coverage, GC composition and tetra-nucleotide composition of each contig, all computed directly from the fasta file. The current version is only compatible with SPAdes output, because coverage is recovered from the fasta headers. Future versions might require a different set of input files. Clustering is performed by a custom script based on the R implementation of the k-means algorithm, using 1500 starting positions for the centroids. The user must supply the desired number of clusters, and contigs are partitioned accordingly. Full details are in Chiara et al. #paper id.</help>
 <inputs>
 	<param name="f1" type="data" format="fasta" label="fasta file with contigs" help="must currently be in SPAdes format"/>
 	<param name="l" type="integer" label="number of clusters" value="5"  help="should correspond to the number of fosmids"/>
 </inputs>
 <outputs>
 	<data name="o1" format="tabular" label="fosmids to cluster table"/>
 	<data name="o2" format="fasta" label="modified fasta file, containing cluster identifiers in the header"/>
 </outputs>
 <tests>
  <test>
        <param name="f1" value="sim1_galaxy.fasta"/>
        <param name="l" value="9"/>
        <output name="o1" file="res"/>
        <output name="o2" file="fasta.fas"/>
  </test>
 </tests>
</tool>
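
The per-contig metrics described in the help text (coverage, GC composition, tetra-nucleotide composition) can be sketched in Python as follows. This is not the tool's own script, which is based on R. The function names, the feature ordering and the header pattern are assumptions made for illustration; in particular, the regular expression assumes SPAdes contig names of the form NODE_<id>_length_<bp>_cov_<coverage>.

```python
import re
from itertools import product

# All 256 tetranucleotides in a fixed order, so every contig yields a
# feature vector with the same layout.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

# Assumed SPAdes header layout: NODE_<id>_length_<bp>_cov_<coverage>
SPADES_HEADER = re.compile(r"NODE_\d+_length_\d+_cov_([0-9.]+)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def contig_features(header, seq):
    """Return [coverage, GC fraction, 256 tetranucleotide frequencies]."""
    match = SPADES_HEADER.search(header)
    if match is None:
        raise ValueError(f"not a SPAdes-style header: {header!r}")
    coverage = float(match.group(1))
    gc = (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
    total = sum(counts.values())
    freqs = [counts[k] / total if total else 0.0 for k in TETRAMERS]
    return [coverage, gc] + freqs
```

The resulting vectors, one per contig, could then be passed to any k-means implementation, for example R's kmeans, whose nstart argument sets the number of random starting configurations.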