ball: galaxy_stubs/FingerprintSimilarityClustering.xml annotate

annotate galaxy_stubs/FingerprintSimilarityClustering.xml @ 2:605370bc1def draft default tip

Uploaded

author	luis
date	Tue, 12 Jul 2016 12:33:33 -0400

rev	line source
2 605370bc1def Uploaded luis parents: diff changeset	1 <?xml version='1.0' encoding='UTF-8'?>
605370bc1def Uploaded luis parents: diff changeset	2 <!--This is a configuration file for the integration of a tool into Galaxy (https://galaxyproject.org/). This file was automatically generated using CTD2Galaxy.-->
605370bc1def Uploaded luis parents: diff changeset	3 <!--Proposed Tool Section: [Chemoinformatics]-->
605370bc1def Uploaded luis parents: diff changeset	4 <tool id="FingerprintSimilarityClustering" name="FingerprintSimilarityClustering" version="1.1.0">
605370bc1def Uploaded luis parents: diff changeset	5 <description>fast clustering of compounds using 2D binary fingerprints</description>
605370bc1def Uploaded luis parents: diff changeset	6 <macros>
605370bc1def Uploaded luis parents: diff changeset	7 <token name="@EXECUTABLE@">FingerprintSimilarityClustering</token>
605370bc1def Uploaded luis parents: diff changeset	8 <import>macros.xml</import>
605370bc1def Uploaded luis parents: diff changeset	9 </macros>
605370bc1def Uploaded luis parents: diff changeset	10 <expand macro="stdio"/>
605370bc1def Uploaded luis parents: diff changeset	11 <expand macro="requirements"/>
605370bc1def Uploaded luis parents: diff changeset	12 <command>FingerprintSimilarityClustering
605370bc1def Uploaded luis parents: diff changeset	13
605370bc1def Uploaded luis parents: diff changeset	14 #if $param_t:
605370bc1def Uploaded luis parents: diff changeset	15 -t $param_t
605370bc1def Uploaded luis parents: diff changeset	16 #end if
605370bc1def Uploaded luis parents: diff changeset	17 #if $param_f:
605370bc1def Uploaded luis parents: diff changeset	18 -f $param_f
605370bc1def Uploaded luis parents: diff changeset	19 #end if
605370bc1def Uploaded luis parents: diff changeset	20 #if $param_fp_col:
605370bc1def Uploaded luis parents: diff changeset	21 -fp_col $param_fp_col
605370bc1def Uploaded luis parents: diff changeset	22 #end if
605370bc1def Uploaded luis parents: diff changeset	23 #if $param_id_col:
605370bc1def Uploaded luis parents: diff changeset	24 -id_col $param_id_col
605370bc1def Uploaded luis parents: diff changeset	25 #end if
605370bc1def Uploaded luis parents: diff changeset	26 #if $param_fp_tag:
605370bc1def Uploaded luis parents: diff changeset	27 -fp_tag "$param_fp_tag"
605370bc1def Uploaded luis parents: diff changeset	28 #end if
605370bc1def Uploaded luis parents: diff changeset	29 #if $param_id_tag:
605370bc1def Uploaded luis parents: diff changeset	30 -id_tag "$param_id_tag"
605370bc1def Uploaded luis parents: diff changeset	31 #end if
605370bc1def Uploaded luis parents: diff changeset	32 #if $param_tc:
605370bc1def Uploaded luis parents: diff changeset	33 -tc $param_tc
605370bc1def Uploaded luis parents: diff changeset	34 #end if
605370bc1def Uploaded luis parents: diff changeset	35 #if $param_cc:
605370bc1def Uploaded luis parents: diff changeset	36 -cc $param_cc
605370bc1def Uploaded luis parents: diff changeset	37 #end if
605370bc1def Uploaded luis parents: diff changeset	38 #if $param_l:
605370bc1def Uploaded luis parents: diff changeset	39 -l $param_l
605370bc1def Uploaded luis parents: diff changeset	40 #end if
605370bc1def Uploaded luis parents: diff changeset	41 #if $param_nt:
605370bc1def Uploaded luis parents: diff changeset	42 -nt "$param_nt"
605370bc1def Uploaded luis parents: diff changeset	43 #end if
605370bc1def Uploaded luis parents: diff changeset	44 #if $param_sdf_out:
605370bc1def Uploaded luis parents: diff changeset	45 -sdf_out $param_sdf_out
605370bc1def Uploaded luis parents: diff changeset	46 #end if
605370bc1def Uploaded luis parents: diff changeset	47 </command>
605370bc1def Uploaded luis parents: diff changeset	48 <inputs>
605370bc1def Uploaded luis parents: diff changeset	49 <param name="param_t" type="data" format="smi.gz,csv,sdf.gz,sdf,txt.gz,smi,txt,csv.gz" optional="False" value="&lt;class 'CTDopts.CTDopts._Null'&gt;" label="Target library input file" help="(-t) "/>
605370bc1def Uploaded luis parents: diff changeset	50 <param name="param_f" type="integer" min="1" max="2" optional="False" value="0" label="Fingerprint format [1 = binary bitstring, 2 = comma separated feature list]" help="(-f) "/>
605370bc1def Uploaded luis parents: diff changeset	51 <param name="param_fp_col" type="integer" value="-1" label="Column number for comma separated smiles input which contains the fingerprint" help="(-fp_col) "/>
605370bc1def Uploaded luis parents: diff changeset	52 <param name="param_id_col" type="integer" value="-1" label="Column number for comma separated smiles input which contains the molecule identifier" help="(-id_col) "/>
605370bc1def Uploaded luis parents: diff changeset	53 <param name="param_fp_tag" type="text" size="30" value=" " label="Tag name for SDF input which contains the fingerprint" help="(-fp_tag) ">
605370bc1def Uploaded luis parents: diff changeset	54 <sanitizer>
605370bc1def Uploaded luis parents: diff changeset	55 <valid initial="string.printable">
605370bc1def Uploaded luis parents: diff changeset	56 <remove value="'"/>
605370bc1def Uploaded luis parents: diff changeset	57 <remove value="&quot;"/>
605370bc1def Uploaded luis parents: diff changeset	58 </valid>
605370bc1def Uploaded luis parents: diff changeset	59 </sanitizer>
605370bc1def Uploaded luis parents: diff changeset	60 </param>
605370bc1def Uploaded luis parents: diff changeset	61 <param name="param_id_tag" type="text" size="30" value=" " label="Tag name for SDF input which contains the molecule identifier" help="(-id_tag) ">
605370bc1def Uploaded luis parents: diff changeset	62 <sanitizer>
605370bc1def Uploaded luis parents: diff changeset	63 <valid initial="string.printable">
605370bc1def Uploaded luis parents: diff changeset	64 <remove value="'"/>
605370bc1def Uploaded luis parents: diff changeset	65 <remove value="&quot;"/>
605370bc1def Uploaded luis parents: diff changeset	66 </valid>
605370bc1def Uploaded luis parents: diff changeset	67 </sanitizer>
605370bc1def Uploaded luis parents: diff changeset	68 </param>
605370bc1def Uploaded luis parents: diff changeset	69 <param name="param_tc" type="float" value="0.7" label="Tanimoto cutoff [default: 0.7]" help="(-tc) "/>
605370bc1def Uploaded luis parents: diff changeset	70 <param name="param_cc" type="integer" value="1000" label="Clustering size cutoff [default: 1000]" help="(-cc) "/>
605370bc1def Uploaded luis parents: diff changeset	71 <param name="param_l" type="integer" value="0" label="Number of fingerprints to read" help="(-l) "/>
605370bc1def Uploaded luis parents: diff changeset	72 <param name="param_nt" type="text" size="30" value="1" label="Number of parallel threads to use" help="(-nt) To use all possible threads enter &lt;max&gt; [default: 1]">
605370bc1def Uploaded luis parents: diff changeset	73 <sanitizer>
605370bc1def Uploaded luis parents: diff changeset	74 <valid initial="string.printable">
605370bc1def Uploaded luis parents: diff changeset	75 <remove value="'"/>
605370bc1def Uploaded luis parents: diff changeset	76 <remove value="&quot;"/>
605370bc1def Uploaded luis parents: diff changeset	77 </valid>
605370bc1def Uploaded luis parents: diff changeset	78 </sanitizer>
605370bc1def Uploaded luis parents: diff changeset	79 </param>
605370bc1def Uploaded luis parents: diff changeset	80 <param name="param_sdf_out" type="integer" min="0" max="1" optional="True" value="0" label="If input file has SD format, this flag activates writing of clustering information as new tags in a copy of the input SD file" help="(-sdf_out) "/>
605370bc1def Uploaded luis parents: diff changeset	81 </inputs>
605370bc1def Uploaded luis parents: diff changeset	82 <expand macro="advanced_options"/>
605370bc1def Uploaded luis parents: diff changeset	83 <outputs>
605370bc1def Uploaded luis parents: diff changeset	84 <data name="param_stdout" format="text" label="Output from stdout"/>
605370bc1def Uploaded luis parents: diff changeset	85 </outputs>
605370bc1def Uploaded luis parents: diff changeset	86 <help>This tool performs a fast and deterministic semi-hierarchical clustering of input compounds encoded as 2D binary fingerprints.
605370bc1def Uploaded luis parents: diff changeset	87
605370bc1def Uploaded luis parents: diff changeset	88 The method is a multistep workflow which first reduces the number of input fingerprints by removing duplicates. This unique set is forwarded to connected
605370bc1def Uploaded luis parents: diff changeset	89 components decomposition by calculating all pairwise Tanimoto similarities and applying a similarity cutoff value. As a third step, all connected components
605370bc1def Uploaded luis parents: diff changeset	90 which exceed a predefined size are hierarchically clustered using the average linkage clustering criterion. The Kelley method is applied on every hierarchical clustering
605370bc1def Uploaded luis parents: diff changeset	91 to determine a level for cluster selection. Finally, the fingerprint duplicates are remapped onto the final clusters which contain their representatives.
605370bc1def Uploaded luis parents: diff changeset	92
605370bc1def Uploaded luis parents: diff changeset	93 For every final cluster a medoid is calculated. A single cluster can have multiple medoids because fingerprint duplicates of a medoid are also marked as medoids.
605370bc1def Uploaded luis parents: diff changeset	94
605370bc1def Uploaded luis parents: diff changeset	95 For every compound the output yields a cluster ID, a medoid tag where '1' indicates the cluster medoid(s) and the average similarity of the compound to all other
605370bc1def Uploaded luis parents: diff changeset	96 cluster members. If the output format is SD, these properties are added as new tags.
605370bc1def Uploaded luis parents: diff changeset	97
605370bc1def Uploaded luis parents: diff changeset	98 ======================================================================================================================================================
605370bc1def Uploaded luis parents: diff changeset	99
605370bc1def Uploaded luis parents: diff changeset	100 Examples:
605370bc1def Uploaded luis parents: diff changeset	101
605370bc1def Uploaded luis parents: diff changeset	102 $ FingerprintSimilarityClustering -t target.sdf -fp_tag FPRINT -f 1 -id_tag NAME
605370bc1def Uploaded luis parents: diff changeset	103 tries to read fingerprints as binary bitstrings (-f 1) from tag &lt;FPRINT&gt; and compound IDs from tag &lt;NAME&gt; of target.sdf input file.
605370bc1def Uploaded luis parents: diff changeset	104 The clustering workflow described is executed on the input molecules with default values.
605370bc1def Uploaded luis parents: diff changeset	105
605370bc1def Uploaded luis parents: diff changeset	106 $ FingerprintSimilarityClustering -t target.csv -fp_col 3 -f 2 -id_col 1
605370bc1def Uploaded luis parents: diff changeset	107 tries to read fingerprints as a comma separated integer feature list (-f 2) from column 3 and IDs from column 1 of a space separated CSV file.
605370bc1def Uploaded luis parents: diff changeset	108 The clustering workflow described is executed on the input molecules with default values.
605370bc1def Uploaded luis parents: diff changeset	109
605370bc1def Uploaded luis parents: diff changeset	110 $ FingerprintSimilarityClustering -t target.sdf -fp_tag FPRINT -f 1 -id_tag NAME -nt max
605370bc1def Uploaded luis parents: diff changeset	111 Same as first example but executed in parallel mode using as many threads as available.
605370bc1def Uploaded luis parents: diff changeset	112
605370bc1def Uploaded luis parents: diff changeset	113 $ FingerprintSimilarityClustering -t target.sdf -fp_tag FPRINT -f 1 -id_tag NAME -tc 0.5 -cc 50
605370bc1def Uploaded luis parents: diff changeset	114 Same as first example but using modified parameters for similarity network generation (-tc 0.5) and size of connected components to be clustered (-cc 50).
605370bc1def Uploaded luis parents: diff changeset	115
605370bc1def Uploaded luis parents: diff changeset	116 </help>
605370bc1def Uploaded luis parents: diff changeset	117 </tool>
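
The connected components step and the medoid selection described in the help text above can be illustrated with a minimal Python sketch. This is not the tool's implementation. It assumes fingerprints are given as sets of on-bit indices, that two compounds are linked when their Tanimoto similarity is greater than or equal to the cutoff, and that the medoid is the member with the highest average similarity to the other cluster members. The names `tanimoto`, `connected_components`, `medoid`, and the example molecule IDs are hypothetical. Duplicate removal, the average linkage clustering of large components, and the Kelley level selection are omitted.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def connected_components(fps: dict, cutoff: float = 0.7) -> list:
    """Group fingerprint IDs that are linked by a chain of similarities >= cutoff.

    Assumption: an edge exists when similarity >= cutoff; the tool may use a
    strict comparison instead.
    """
    parent = {name: name for name in fps}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise comparisons, as in the help text.
    for a, b in combinations(fps, 2):
        if tanimoto(fps[a], fps[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in fps:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def medoid(members: list, fps: dict) -> str:
    """Member with the highest summed similarity to the other members (assumed definition)."""
    if len(members) == 1:
        return members[0]
    return max(
        sorted(members),
        key=lambda m: sum(tanimoto(fps[m], fps[o]) for o in members if o != m),
    )


# Hypothetical toy input: four compounds with small feature-list fingerprints.
fps = {
    "mol1": frozenset({1, 2, 3, 4}),
    "mol2": frozenset({1, 2, 3, 4, 5}),
    "mol3": frozenset({2, 3, 4, 5}),
    "mol4": frozenset({10, 11, 12}),
}

components = connected_components(fps, cutoff=0.7)
for cluster_id, members in enumerate(components):
    print(cluster_id, members, "medoid:", medoid(members, fps))
```

In this toy input, mol1 and mol3 have a similarity of only 0.6 but still end up in the same component, because both are linked to mol2 at 0.8. This is why the workflow hierarchically clusters components that exceed the size cutoff.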
