<tool id="ctb_chemfp_butina_clustering" name="Taylor-Butina Clustering" version="0.1">
    <description>of molecular fingerprints</description>
    <requirements>
        <requirement type="package" version="1.1p1">chemfp</requirement>
        <requirement type="package" version="2.3.2">openbabel</requirement>
    </requirements>
    <command interpreter='python'>
        butina_clustering.py
        -i $infile
        -t $threshold
        -o $outfile
        -p 4
    </command>
    <inputs>
        <param name="infile" type="data" format="fps" label="Fingerprint dataset" help="Dataset missing? See TIP below"/>
        <param name='threshold' type='float' value='0.8' label='Tanimoto threshold' help='Between 0 and 1'/>
    </inputs>
    <outputs>
        <data format="tabular" name="outfile" label="${tool.name} on ${on_string}"/>
    </outputs>
    <tests>
        <test>
            <param name="infile" ftype="fps" value="q.fps"/>
            <param name='threshold' value='0.8'/>
            <output name="outfile" ftype="tabular" file='Taylor-Butina_Clustering_on_data_q.txt'/>
        </test>
    </tests>
    <help>


.. class:: infomark

**What this tool does**

Clusters molecular fingerprints with the Taylor-Butina algorithm, an unsupervised, non-hierarchical method which guarantees that every molecule in a cluster lies within a distance cutoff of the cluster's central molecule. This tool is based on the chemfp_ project.
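
The selection procedure can be sketched in Python as follows. This is an illustration only, not the code in butina_clustering.py: fingerprints are assumed to be sets of on-bit positions, ``tanimoto`` and ``butina`` are hypothetical helper names, and ties between candidates with equally long neighbor lists are broken by input order, which may differ from what chemfp does.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a.union(b))
    return len(a.intersection(b)) / union if union else 0.0


def butina(fingerprints, threshold):
    """Cluster a dict of {identifier: set of on-bit positions}.

    Returns (true_singletons, false_singletons, clusters), where clusters
    is a list of (centroid, members) pairs.
    """
    # Step 1: build neighbor lists, i.e. all pairs at or above the threshold.
    neighbors = {name: set() for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Step 2: visit candidates from most to fewest neighbors.
    order = sorted(neighbors, key=lambda name: len(neighbors[name]), reverse=True)

    assigned = set()
    true_singletons, false_singletons, clusters = [], [], []
    for name in order:
        if name in assigned:
            continue
        assigned.add(name)
        if not neighbors[name]:
            true_singletons.append(name)   # never had a neighbor
            continue
        members = neighbors[name] - assigned
        if not members:
            false_singletons.append(name)  # every neighbor was already claimed
            continue
        assigned.update(members)
        clusters.append((name, sorted(members)))
    return true_singletons, false_singletons, clusters


# Toy data (made up for illustration, not taken from a real dataset).
example = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 4, 5},
    "C": {1, 2, 3, 4, 6},
    "D": {10, 11},
    "E": {1, 2, 3, 4, 5, 7},
}
print(butina(example, 0.8))
```

With these five toy fingerprints and a threshold of 0.8, A becomes the centroid of a cluster holding B and C, D is a true singleton because it has no neighbors, and E is a false singleton because its only neighbor, B, was already claimed by A's cluster.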

.. _chemfp: http://chemfp.com/

-----

.. class:: infomark

**Input**

| Molecular fingerprints in FPS format.
| Open Babel Fastsearch index is not supported.

* Example::

    - fingerprints in FPS format

    #FPS1
    #num_bits=881
    #type=CACTVS-E_SCREEN/1.0 extended=2
    #software=CACTVS/unknown
    #source=/home/mohammed/galaxy-central/database/files/000/dataset_423.dat
    #date=2012-02-09T13:20:37
    07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701487e960cc0bed3248000580644626004101b4844805901b041c2e
    19511e45039b8b2926101609401b13e40800000000000100200000040080000010000002000000000000 55169009
    07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701087e960cc0bed3248000580644626004101b4844805901b041c2e
    19111e45039b8b2926105609401313e40800000000000100200000040080000010000002000000000000 55079807
    ........

    - Tanimoto threshold: 0.8 (between 0 and 1)
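
As a rough guide to the layout shown above, the following Python sketch reads FPS records into sets of on-bit positions. It assumes that each record sits on a single line in the actual file (the long fingerprints in the example appear to be wrapped over two lines for display), that the hex field and the identifier are separated by whitespace, and that bit 0 is the lowest bit of the first byte. ``parse_fps`` is a hypothetical helper name, not part of chemfp.

```python
def parse_fps(lines):
    """Yield (identifier, set of on-bit positions) for each FPS record."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#key=value' header lines
        hex_fp, identifier = line.split()[:2]
        raw = bytes.fromhex(hex_fp)
        # Assumed bit order: bit 0 is the lowest bit of the first byte.
        bits = {
            8 * i + j
            for i, byte in enumerate(raw)
            for j in range(8)
            if (byte >> j) % 2
        }
        yield identifier, bits


# Made-up 16-bit records, far shorter than the 881-bit example above.
sample = ["#FPS1", "#num_bits=16", "0f00\tmolA", "0380\tmolB"]
for identifier, bits in parse_fps(sample):
    print(identifier, sorted(bits))
```

Sets of bit positions are convenient for small examples; for real datasets a packed representation such as the one chemfp uses internally is far more efficient.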

-----

.. class:: infomark

**Output**

* Example::

    0 true singletons
    =>

    0 false singletons
    =>

    1 clusters
    55091849 has 12 other members
    => 6499094 6485578 55079807 3153534 55102353 55091466 55091416 6485577 55169009 55091752 55091467 55168823
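
The report shown above can be read back with a few lines of Python. The sketch below assumes the layout is exactly as in the example, that is, a ``... has N other members`` line followed by a ``=>`` line listing the members separated by whitespace. ``parse_clusters`` is a hypothetical helper name.

```python
import re

# Matches the line that names a cluster centroid, e.g.
# "55091849 has 12 other members".
HEADER = re.compile(r"^(\S+) has (\d+) other members$")


def parse_clusters(lines):
    """Return {centroid: [member, ...]} from the tool's text report."""
    clusters = {}
    centroid = None
    for line in lines:
        line = line.strip()
        match = HEADER.match(line)
        if match:
            centroid = match.group(1)
        elif line.startswith("=>") and centroid is not None:
            # The member identifiers follow the "=>" marker.
            clusters[centroid] = line[2:].split()
            centroid = None
    return clusters
```

The ``=>`` lines under the two singleton headings are skipped because no centroid line precedes them; extending the sketch to collect singletons would require knowing how they are listed, which the example above does not show.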

-----

.. class:: infomark

**Cite**

The chemfp_ project from Andrew Dalke!

.. _chemfp: http://chemfp.com/

    </help>

</tool>