<tool id="abims_hclustering" name="Hierarchical Clustering" version="1.1.2">

    <description>using ctc R package for java-treeview</description>

    <command interpreter="Rscript"><![CDATA[
        abims_hclustering.r file "$input" method $method link $link keep.hclust FALSE normalization $normalization sep "$sep" dec "$dec" && mv hclust.zip $outputzip
    ]]></command>

    <inputs>
        <param name="input" type="data" label="Data Matrix file" format="tabular" help="Matrix of numeric data with headers." />
        <param name="method" type="select" label="Distance measure method" help="the distance measure to be used">
            <option value="pearson" selected="true">pearson</option>
            <option value="euclidean" >euclidean</option>
            <option value="maximum" >maximum</option>
            <option value="manhattan" >manhattan</option>
            <option value="canberra" >canberra</option>
            <option value="binary" >binary</option>
            <option value="correlation" >correlation</option>
            <option value="spearman" >spearman</option>
        </param>
        <param name="link" type="select" label="Agglomeration/Link method" help="the agglomeration method to be used">
            <option value="ward" selected="true">ward</option>
            <option value="single" >single</option>
            <option value="complete" >complete</option>
            <option value="average" >average</option>
            <option value="mcquitty" >mcquitty</option>
            <option value="median" >median</option>
            <option value="centroid" >centroid</option>
        </param>
        <param name="normalization" type="select" label="Normalization by center and scale" help="Centering is done by subtracting the column means, and scaling is done by dividing the (centered) columns by their standard deviations.">
            <option value="T" selected="true">TRUE</option>
            <option value="F" >FALSE</option>
        </param>

        <param name="sep" type="select" format="text" optional="true">
            <label>Separator of columns</label>
            <option value="tabulation">tabulation</option>
            <option value="semicolon">;</option>
            <option value="comma">,</option>
        </param>
        <param name="dec" type="text" label="Decimal separator" value="." help="" />

        <!--<param name="nr_col_names" type="integer" label="names" value="2" help="number of the column with names of metabolites" />
        <param name="from" type="integer" label="from" value="15" help="number of the column starting peak values data (to exclude all metadata)" />
        <param name="to" type="integer" label="to" value="30" help="number of the column finishing peak values data (to exclude all metadata)" />
        <param name="gr_number" type="integer" label="gr_number" value="2" help="number of groups (conditions)" />
        <param name="nb_col_gr" type="text" label="nb_col_gr" value="8,8" help="number of columns of each group; separate with a comma as indicated; first position corresponding to the first group etc." />
        <param name="threshold" type="float" label="threshold" value="0.01" help="max adjusted p.value accepted" />-->

    </inputs>

    <outputs>
        <data name="outputzip" format="zip" label="${input.name[:-4]}.heatmap.zip for Java Treeview" />
    </outputs>

    <stdio>
        <exit_code range="1:" level="fatal" />
    </stdio>

    <help><![CDATA[

.. class:: infomark

**Authors** Gildas Le Corguille ABiMS - UPMC/CNRS - Station Biologique de Roscoff - gildas.lecorguille|at|sb-roscoff.fr

---------------------------------------------------

=======================
Hierarchical Clustering
=======================

-----------
Description
-----------

This function computes a hierarchical clustering with the hcluster function
and exports the clusters in the Java TreeView file formats: jtreeview.sourceforge.net.

This function performs a **hierarchical cluster analysis** using a set
of dissimilarities for the n objects being clustered. Initially,
each object is assigned to its own cluster and then the algorithm
proceeds iteratively, at each stage joining the two most similar
clusters, continuing until there is just a single cluster. At
each stage distances between clusters are recomputed by the
Lance-Williams dissimilarity update formula according to the
particular clustering method being used.
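
As a minimal sketch of the iterative procedure described above, the following Python code joins the two closest clusters at each stage and recomputes distances with the Lance-Williams update. It is an illustration only and is not the hcluster implementation that this tool calls; the names ``agglomerate`` and ``lw_coefficients`` are hypothetical, and only the single, complete and average linkages are covered to keep it short.

```python
from itertools import combinations


def lw_coefficients(link, n_i, n_j):
    """Lance-Williams coefficients (alpha_i, alpha_j, beta, gamma).

    n_i and n_j are the sizes of the two clusters being merged.
    """
    if link == "single":
        return 0.5, 0.5, 0.0, -0.5
    if link == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if link == "average":
        total = n_i + n_j
        return n_i / total, n_j / total, 0.0, 0.0
    raise ValueError(f"unsupported link: {link}")


def agglomerate(dist, link="average"):
    """Naive agglomerative clustering on a symmetric distance matrix.

    Returns the merge history as (cluster_a, cluster_b, height) tuples.
    Original objects are numbered 0..n-1; merged clusters get new ids.
    """
    n = len(dist)
    # d[(a, b)] with a < b holds the current dissimilarity between clusters.
    d = {(a, b): dist[a][b] for a, b in combinations(range(n), 2)}
    size = {k: 1 for k in range(n)}
    merges = []
    next_id = n
    while len(size) > 1:
        # Join the two most similar (closest) clusters.
        (i, j), height = min(d.items(), key=lambda kv: kv[1])
        a_i, a_j, beta, gamma = lw_coefficients(link, size[i], size[j])
        for k in size:
            if k in (i, j):
                continue
            d_ki = d.pop((min(k, i), max(k, i)))
            d_kj = d.pop((min(k, j), max(k, j)))
            # Lance-Williams update: distance from k to the merged cluster.
            d[(k, next_id)] = (a_i * d_ki + a_j * d_kj
                               + beta * height + gamma * abs(d_ki - d_kj))
        del d[(i, j)]
        size[next_id] = size.pop(i) + size.pop(j)
        merges.append((i, j, height))
        next_id += 1
    return merges


# Three points on a line at positions 0, 1 and 5.
points = [[0, 1, 5], [1, 0, 4], [5, 4, 0]]
print(agglomerate(points, "single"))  # [(0, 1, 1), (2, 3, 4.0)]
```

With the same three points, the last merge happens at height 5.0 for the complete link and 4.5 for the average link; only the coefficients change between methods.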

A number of different **clustering methods** are provided. **Ward's**
minimum variance method aims at finding compact, spherical
clusters. The **complete linkage** method finds similar clusters.
The **single linkage** method (which is closely related to the
minimal spanning tree) adopts a ‘friends of friends’ clustering
strategy. The other methods can be regarded as aiming for
clusters with characteristics somewhere between the single and
complete link methods. Note, however, that the **median** and
**centroid** methods do not lead to a monotone distance measure;
equivalently, the resulting dendrograms can have so-called
inversions (which are hard to interpret).

-----------------
Workflow position
-----------------

**Upstream tools**

+---------------------------+----------------------------------------+--------+------------------------+
| Name                      | Output file                            | Format | Parameter              |
+===========================+========================================+========+========================+
|xcms.diffreport            |xset.diffreport.data_matrix.tsv         | Tabular| Data Matrix file       |
+---------------------------+----------------------------------------+--------+------------------------+
|CAMERA.annotateDiffreport  |xset.annotatediffreport.data_matrix.tsv | Tabular| Data Matrix file       |
+---------------------------+----------------------------------------+--------+------------------------+
|Anova                      |xset.anova_filtered.tabular             | Tabular| Data Matrix file       |
+---------------------------+----------------------------------------+--------+------------------------+

**Downstream tools**

+---------------------------+-----------------------------------------------+---------------------+
| Name                      | Output file                                   | Format              |
+===========================+===============================================+=====================+
|Treeview (out of Galaxy)   | cdt,gtr and atr files needed for Java Treeview|Java Treeview formats|
+---------------------------+-----------------------------------------------+---------------------+

-----------
Input files
-----------

+---------------------------+------------+
| Parameter : num + label   |   Format   |
+===========================+============+
| 1 : Data Matrix file      |   Tabular  |
+---------------------------+------------+

----------
Parameters
----------

**Agglomeration or Link method:**

A number of different clustering methods are provided. Ward's minimum variance method aims at finding compact, spherical clusters.
The complete linkage method finds similar clusters. The single linkage method (which is closely related to the minimal spanning tree) adopts a ‘friends of friends’ clustering strategy.
The other methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods.
Note, however, that the median and centroid methods do not lead to a monotone distance measure; equivalently, the resulting dendrograms can have so-called inversions (which are hard to interpret).

------------
Output files
------------

***.tab.hclust.zip**

| A zip file containing three files (hclust.atr, hclust.cdt and hclust.gtr) in Treeview format. For more information, or to download Treeview, visit the website:
| http://jtreeview.sourceforge.net
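
If you want to check the archive before opening it in Treeview, a small script can do so. The following is a sketch that assumes the three file names listed above and uses only Python's standard ``zipfile`` module; ``missing_treeview_files`` and ``extract_treeview_files`` are hypothetical helper names and are not part of this tool.

```python
import zipfile

# File names taken from the output description above.
EXPECTED = {"hclust.atr", "hclust.cdt", "hclust.gtr"}


def missing_treeview_files(zip_path):
    """Return the expected Treeview files that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare on base names in case the archive stores a folder prefix.
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return EXPECTED - present


def extract_treeview_files(zip_path, dest_dir):
    """Extract the archive into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

An empty set returned by ``missing_treeview_files`` means that all three files are present and the extracted folder can be opened in Treeview.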

------

.. class:: infomark

You can continue your analysis using Treeview (outside of Galaxy) with the three files (atr, cdt and gtr) within the **xset.tab.hclust.zip** output.

---------------------------------------------------

---------------
Working example
---------------

Input files
-----------

**Part of an example Data Matrix file input**

+--------+------------------+----------------+
| Name   | Bur-eH_FSP_102   | Bur-eH_FSP_22  |
+========+==================+================+
|M202T601| 91206595.7559783 |106808979.08546 |
+--------+------------------+----------------+
|M234T851| 27249137.275504  |28824971.3177926|
+--------+------------------+----------------+

Parameters
----------

| Distance measure method -> **pearson**
| Agglomeration/Link method -> **ward**
| Normalization by center and scale -> **TRUE**
| Separator of columns -> **tabulation**
| Decimal separator -> **.**

Output files
------------

**Example of a dendrogram/heatmap generated by the Treeview tool**:

.. image:: hclust.png

    ]]></help>

</tool>