comparison abims_hclustering.xml @ 0:2f7381ee5235 draft
Uploaded
author | lecorguille |
---|---|
date | Tue, 30 Jun 2015 06:36:09 -0400 |
parents | |
children | 36fc0a87d7fb |
-1:000000000000 | 0:2f7381ee5235 |
---|---|
1 <tool id="abims_hclustering" name="Hierarchical Clustering" version="1.1"> | |
2 | |
3 <description>using ctc R package for java-treeview</description> | |
4 | |
5 <command interpreter="Rscript"> | |
6 abims_hclustering.r file "$input" method $method link $link keep.hclust FALSE normalization $normalization sep "$sep" dec "$dec" | |
7 </command> | |
8 | |
9 <inputs> | |
10 <param name="input" type="data" label="Data Matrix file" format="tabular" help="Matrix of numeric data with headers." /> | |
11 <param name="method" type="select" label="Distance measure method" help="the distance measure to be used"> | |
12 <option value="pearson" selected="true">pearson</option> | |
13 <option value="euclidean" >euclidean</option> | |
14 <option value="maximum" >maximum</option> | |
15 <option value="manhattan" >manhattan</option> | |
16 <option value="canberra" >canberra</option> | |
17 <option value="binary" >binary</option> | |
18 <option value="correlation" >correlation</option> | |
19 <option value="spearman" >spearman</option> | |
20 </param> | |
21 <param name="link" type="select" label="Agglomeration/Link method" help="the agglomeration method to be used"> | |
22 <option value="ward" selected="true">ward</option> | |
23 <option value="single" >single</option> | |
24 <option value="complete" >complete</option> | |
25 <option value="average" >average</option> | |
26 <option value="mcquitty" >mcquitty</option> | |
27 <option value="median" >median</option> | |
28 <option value="centroid" >centroid</option> | |
29 </param> | |
30 <param name="normalization" type="select" label="Normalization by center and scale" help="Centering is done by subtracting the column means, and scaling is done by dividing the (centered) columns by their standard deviations"> | |
31 <option value="T" selected="true">TRUE</option> | |
32 <option value="F" >FALSE</option> | |
33 </param> | |
34 | |
35 <param name="sep" type="select" format="text" optional="true"> | |
36 <label>Separator of columns</label> | |
37 <option value="tabulation">tabulation</option> | |
38 <option value="semicolon">;</option> | |
39 <option value="comma">,</option> | |
40 </param> | |
41 <param name="dec" type="text" label="Decimal separator" value="." help="" /> | |
42 | |
43 <!--<param name="nr_col_names" type="integer" label="names" value="2" help="number of the column with names of metabolites" /> | |
44 <param name="from" type="integer" label="from" value="15" help="number of the column starting peak values data (to exclude all metadata)" /> | |
45 <param name="to" type="integer" label="to" value="30" help="number of the column finishing peak values data (to exclude all metadata)" /> | |
46 <param name="gr_number" type="integer" label="gr_number" value="2" help="number of groups (conditions)" /> | |
47 <param name="nb_col_gr" type="text" label="nb_col_gr" value="8,8" help="number of columns in each group; separate with commas as indicated; the first position corresponds to the first group, etc." /> | |
48 <param name="threshold" type="float" label="threshold" value="0.01" help="max adjusted p.value accepted" />--> | |
49 | |
50 </inputs> | |
51 | |
52 <outputs> | |
53 <data name="hclust_zip" format="zip" from_work_dir="hclust.zip" label="${input.name[:-4]}.hclust.zip" /> | |
54 </outputs> | |
55 | |
56 <stdio> | |
57 <exit_code range="1:" level="fatal" /> | |
58 </stdio> | |
59 | |
60 <help> | |
61 | |
62 | |
63 | |
64 .. class:: infomark | |
65 | |
66 **Authors** Gildas Le Corguille ABiMS - UPMC/CNRS - Station Biologique de Roscoff - gildas.lecorguille|at|sb-roscoff.fr | |
67 | |
68 --------------------------------------------------- | |
69 | |
70 ======================= | |
71 Hierarchical Clustering | |
72 ======================= | |
73 | |
74 ----------- | |
75 Description | |
76 ----------- | |
77 | |
78 This function computes a hierarchical clustering with the function | |
79 hcluster and exports the clusters in the Java TreeView file format: jtreeview.sourceforge.net. | |
80 | |
81 This function performs a **hierarchical cluster analysis** using a set | |
82 of dissimilarities for the n objects being clustered. Initially, | |
83 each object is assigned to its own cluster and then the algorithm | |
84 proceeds iteratively, at each stage joining the two most similar | |
85 clusters, continuing until there is just a single cluster. At | |
86 each stage distances between clusters are recomputed by the | |
87 Lance-Williams dissimilarity update formula according to the | |
88 particular clustering method being used. | |
89 | |
90 A number of different **clustering methods** are provided. **Ward's** | |
91 minimum variance method aims at finding compact, spherical | |
92 clusters. The **complete linkage** method finds similar clusters. | |
93 The **single linkage** method (which is closely related to the | |
94 minimal spanning tree) adopts a ‘friends of friends’ clustering | |
95 strategy. The other methods can be regarded as aiming for | |
96 clusters with characteristics somewhere between the single and | |
97 complete link methods. Note, however, that the **median** and | |
98 **centroid** methods do not lead to a monotone distance measure; | |
99 equivalently, the resulting dendrograms can have so-called | |
100 inversions (which are hard to interpret). | |
101 | |
102 | |
103 | |
104 | |
105 ----------- | |
106 Input files | |
107 ----------- | |
108 | |
109 +---------------------------+------------+ | |
110 | Parameter : num + label | Format | | |
111 +===========================+============+ | |
112 | 1 : Data Matrix file | Tabular | | |
113 +---------------------------+------------+ | |
114 | |
115 | |
116 ---------- | |
117 Parameters | |
118 ---------- | |
119 | |
120 | |
121 **Agglomeration or Link method:** | |
122 | |
123 A number of different clustering methods are provided. Ward's minimum variance method aims at finding compact, spherical clusters. | |
124 The complete linkage method finds similar clusters. The single linkage method (which is closely related to the minimal spanning tree) adopts a ‘friends of friends’ clustering strategy. | |
125 The other methods can be regarded as aiming for clusters with characteristics somewhere between the single and complete link methods. | |
126 Note, however, that the median and centroid methods do not lead to a monotone distance measure; equivalently, the resulting dendrograms can have so-called inversions (which are hard to interpret). | |
127 | |
128 | |
129 ------------ | |
130 Output files | |
131 ------------ | |
132 | |
133 ***.tab.hclust.zip** | |
134 | |
135 | A zip file containing three files (hclust.atr, hclust.cdt and hclust.gtr) in the Treeview format. For more information, or to download Treeview, visit the website: | |
136 | http://jtreeview.sourceforge.net | |
137 | |
138 | |
139 | |
140 ------ | |
141 | |
142 .. class:: infomark | |
143 | |
144 You can continue your analysis using Treeview (outside of Galaxy) with the three files (atr, cdt and gtr) within the **xset.tab.hclust.zip** output. | |
145 | |
146 | |
147 | |
148 | |
149 --------------------------------------------------- | |
150 | |
151 --------------- | |
152 Working example | |
153 --------------- | |
154 | |
155 | |
156 Input files | |
157 ----------- | |
158 | |
159 **>Part of an example Data Matrix input file** | |
160 | |
161 | |
162 +--------+------------------+----------------+ | |
163 | Name | Bur-eH_FSP_102 | Bur-eH_FSP_22 | | |
164 +========+==================+================+ | |
165 |M202T601| 91206595.7559783 |106808979.08546 | | |
166 +--------+------------------+----------------+ | |
167 |M234T851| 27249137.275504 |28824971.3177926| | |
168 +--------+------------------+----------------+ | |
169 | |
170 | |
171 Parameters | |
172 ---------- | |
173 | |
174 | Distance measure method -> **pearson** | |
175 | Agglomeration/Link method -> **ward** | |
176 | Normalization by center and scale -> **TRUE** | |
177 | Separator of columns -> **tabulation** | |
178 | Decimal separator -> **.** | |
179 | |
180 | |
181 | |
182 Output files | |
183 ------------ | |
184 | |
185 **Example of a dendrogram/heatmap generated by the Treeview tool**: | |
186 | |
187 .. image:: hclust.png | |
188 | |
189 | |
190 </help> | |
191 | |
192 </tool> |
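
The help text in this tool describes an agglomerative procedure: each object starts in its own cluster, the two closest clusters are joined at each stage, and the distances to the new cluster are recomputed with the Lance-Williams update formula. The standalone Python sketch below illustrates that procedure. It is not the code of abims_hclustering.r, which is not shown in this changeset. The function names `coefficients` and `agglomerate` are hypothetical, only the single, complete and average link methods are covered, and the input is assumed to be a precomputed symmetric dissimilarity matrix.

```python
# Illustrative sketch only: NOT the implementation used by abims_hclustering.r.
# All names here are hypothetical and chosen for this example.


def coefficients(link, n_i, n_j):
    """Lance-Williams coefficients (alpha_i, alpha_j, beta, gamma).

    n_i and n_j are the sizes of the two clusters being merged.
    """
    if link == "single":
        return 0.5, 0.5, 0.0, -0.5
    if link == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if link == "average":
        total = n_i + n_j
        return n_i / total, n_j / total, 0.0, 0.0
    raise ValueError("unsupported link method: " + link)


def agglomerate(dist, link="average"):
    """Cluster n objects given a symmetric n x n dissimilarity matrix.

    Returns the merges in order as (cluster_a, cluster_b, height), where
    each cluster is a frozenset of original object indices.
    """
    n = len(dist)
    # Every object starts in its own cluster, identified by its index.
    clusters = {i: frozenset([i]) for i in range(n)}
    # Pairwise dissimilarities, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    merges = []
    next_id = n
    while len(clusters) > 1:
        # Join the two most similar clusters (first pair wins on ties).
        i, j = min(d, key=d.get)
        height = d[(i, j)]
        a_i, a_j, beta, gamma = coefficients(
            link, len(clusters[i]), len(clusters[j])
        )
        # Lance-Williams update: distance from every other cluster k
        # to the cluster formed by merging i and j.
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            d_ki, d_kj = get(k, i), get(k, j)
            updated[k] = (a_i * d_ki + a_j * d_kj
                          + beta * height + gamma * abs(d_ki - d_kj))
        merges.append((clusters[i], clusters[j], height))
        # Drop every distance involving i or j, then register the new cluster.
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        clusters[next_id] = clusters.pop(i) | clusters.pop(j)
        for k, value in updated.items():
            d[(k, next_id)] = value  # k is always smaller than next_id
        next_id += 1
    return merges


if __name__ == "__main__":
    # Four points on a line, with absolute differences as dissimilarities.
    points = [0, 1, 5, 6]
    matrix = [[abs(a - b) for b in points] for a in points]
    for members_a, members_b, height in agglomerate(matrix, "average"):
        print(sorted(members_a), sorted(members_b), height)
```

Changing the `link` argument changes only the coefficients, which is why one update formula can serve several agglomeration methods. The ward, mcquitty, median and centroid options of the tool would need further coefficient sets that this sketch leaves out.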