<tool id="VCFCarto" name="VCFCarto" version="0.01">
    <description>converts a tabulated marker file into a file with only the markers that differ between 2 parents</description>
    <requirements>
        <requirement type="package" version="1.0">VCF_Gandalf_Tools</requirement>
    </requirements>
    <version_command>
        VCFCarto.py --version
    </version_command>
    <command interpreter="python">
        VCFCarto_wrapper.py -f "$inputTabular" -o "$outputVCFCarto" -A "$parentA" -H "$parentH"
        #if str($outputType) == "carto"
            -p -g --graphHTML "$output_html" --dirGraphs "$output_html.files_path"
        #end if
        #if str($outputType) == "MergedCarto"
            -p -g --graphHTML "$output_html" --dirGraphs "$output_html.files_path" -m --mergeFile "$output_bed"
        #end if
    </command>
    <inputs>
        <param name="inputTabular" type="data" format="tabular" label="indicate your tabulated marker file"/>
        <param name="parentA" size="20" type="text" value="V1" label="indicate parent 1 name (A)"/>
        <param name="parentH" size="20" type="text" value="V2" label="indicate parent 2 name (H)"/>
        <param name="outputType" type="select" display="radio" label="select type of output" multiple="False">
            <option value="raw">7 character code</option>
            <option value="carto">A - H code</option>
            <option value="MergedCarto">A - H code and merge</option>
        </param>
    </inputs>
    <outputs>
        <data format="tabular" name="outputVCFCarto" label="${tool.name} on ${on_string} (tabular)"/>
        <data format="html" name="output_html" label="${tool.name} graphs on ${on_string} (html)">
            <filter>not outputType == "raw"</filter>
        </data>
        <data format="bed" name="output_bed" label="${tool.name} markers on ${on_string} (bed)">
            <filter>outputType == "MergedCarto"</filter>
        </data>
    </outputs>
    <tests>
        <test>
            <param name="inputTabular" value="VCFCarto_input.tab"/>
            <param name="parentA" value="REF1"/>
            <param name="parentH" value="REF2"/>
            <param name="outputType" value="raw"/>
            <output name="outputVCFCarto" file="VCFCarto_output.tab" ftype="tabular"/>
        </test>
        <test>
            <param name="inputTabular" value="VCFCarto_input.tab"/>
            <param name="parentA" value="REF1"/>
            <param name="parentH" value="REF2"/>
            <param name="outputType" value="MergedCarto"/>
            <output name="outputVCFCarto" file="VCFCarto_output_merged.tab" ftype="tabular"/>
            <output name="output_bed" file="VCFCarto_output_merged.bed" ftype="bed"/>
        </test>
    </tests>
    <help><![CDATA[

**VCFcarto converts a tabulated marker file into a file with only the markers from 2 parents**

.. class:: infomark

expected input format is the output from VCFStorage.

-----

**what it does :**

VCFcarto converts a tabulated marker file into a file with only the markers from 2 parents, parent A and parent H.

2 output formats are possible: either the input format is kept, or the codes are converted to a 3-symbol format (A, H, -).

-----

**input format :**

.. class:: infomark

expected input format is the output from VCFStorage.

the expected format is a tab-delimited file with one row per genomic position and one column per strain.

For each position and each genome, a code is attributed :

- for the reference : ::

    A,T,G,C for the corresponding nucleotide

- for the genomes : ::

    U if the position is not reported in the VCF file
    R if the base is identical to the reference
    F if the base has been filtered out
    A,T,G,C if the genome has a validated SNP at the position

-----

**output format :**

for the main output, 2 formats are possible :

- The first format is identical to the input format (same columns and codes), but only the lines where the 2 parents have different alleles are kept.

- The second format (A - H format) uses a much simpler code ::

    "A" when the strain allele is the same as parent A
    "H" when the strain allele is the same as parent H
    "-" in any other case (base filtered out, different base, base unmapped, etc.)

the second format may be used as an input for a cartographic (genetic mapping) tool.
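
As an illustration only, the A - H coding can be sketched in Python. This is not the code of VCFCarto.py: the helper names (is_informative, to_ah, encode_row) are hypothetical, and the rule that a line is kept only when both parents carry a called code (neither U nor F) and differ is an assumption inferred from the example further down.

```python
# Hypothetical helpers, not taken from VCFCarto.py.
# Assumption: a parent coded U or F never yields a usable marker.
CALLED = set("RATGC")


def is_informative(parent_a, parent_h):
    """Keep a line only if both parents are called and carry different codes."""
    return parent_a in CALLED and parent_h in CALLED and parent_a != parent_h


def to_ah(code, parent_a, parent_h):
    """Translate one input code into A, H or -."""
    if code == parent_a:
        return "A"
    if code == parent_h:
        return "H"
    return "-"


def encode_row(fields, idx_a, idx_h):
    """fields: one split input line; idx_a / idx_h: parent column indexes.

    Returns the recoded line, or None when the line is not informative.
    """
    parent_a, parent_h = fields[idx_a], fields[idx_h]
    if not is_informative(parent_a, parent_h):
        return None
    # CHROM and POS are kept, every other column is recoded.
    return fields[:2] + [to_ah(c, parent_a, parent_h) for c in fields[2:]]


# Line "Chr1 7" of the example, with REF1 (column 3) and REF2 (column 5) as parents.
row = "Chr1 7 A G C C C F C C C C C G C G G".split()
print(" ".join(encode_row(row, 3, 5)))
```

In this sketch the codes are compared literally, so the reference column always becomes "-": a validated SNP never carries the same letter as the reference base.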

If you choose the A - H format, you can also merge consecutive markers that carry the same information (every strain has the same code at both markers). If you do so, new markers are generated and a bed file links the output markers to the input positions.
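
The merge step can be sketched as follows, again as an illustration rather than the actual implementation. The function name merge_markers is hypothetical; the sketch assumes that "consecutive" means consecutive lines of the A - H output on the same chromosome, and it copies the ``*M_00001`` naming pattern from the example further down.

```python
def merge_markers(rows):
    """rows: (chrom, pos, codes) tuples in file order, codes a list of A / H / -.

    Returns (merged_rows, bed_entries). Assumption: only neighbouring lines
    on the same chromosome with identical codes are merged.
    """
    merged, bed = [], []
    for chrom, pos, codes in rows:
        if merged and merged[-1][0] == chrom and merged[-1][2] == codes:
            # Same information as the previous marker: extend its end position.
            bed[-1][2] = pos
            continue
        name = "*M_%05d" % (len(merged) + 1)
        merged.append((chrom, name, codes))
        bed.append([chrom, pos, pos, name])
    return merged, bed


# Lines Chr1 17 to Chr1 23 of the A - H example.
lines = """\
Chr1 17 - A H H A - A A H H A H - A H
Chr1 19 - A - H A H A - H H A A A H A
Chr1 20 - A - H A H A - H H A A A H A
Chr1 21 - A - H A H A - H H A A A H A
Chr1 22 - A - H A H A - H H A A A H A
Chr1 23 - A A H A H H H A H - A H A A
""".splitlines()
rows = [(f[0], int(f[1]), f[2:]) for f in (line.split() for line in lines)]

merged, bed = merge_markers(rows)
for entry in bed:
    print(*entry)
```

Positions 19 to 22 collapse into a single marker here. The marker numbers differ from the full example because the numbering restarts with this shortened input.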

Finally, graphs are produced to visualise the result.

-----

**example :**

input : ::

    CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
    Chr1 1 A R R R R U R R R R R R R R R
    Chr1 2 T R R R R R U R R R R R R R R
    Chr1 3 G R R R R R R R R R R R R R R
    Chr1 4 G R R R R R R R R R R R R F R
    Chr1 5 G R R R R R R U F R R R R R R
    Chr1 6 C R R R R R R R R R R R R R U
    Chr1 7 A G C C C F C C C C C G C G G
    Chr1 8 G R R R R R R R R R R R R R R
    Chr1 9 C R T T R T T T U R T R T T T
    Chr1 10 T R R R R R R R R R R R R R U
    Chr1 11 T R R R R R R R R R R R F R R
    Chr1 12 A R R R R U R R R R F R R R R
    Chr1 13 A R R G G R F R F G R G R R F
    Chr1 14 A R R R R R R R R F R R R R R
    Chr1 15 G R R R U R F R R R R R R U U
    Chr1 16 G A R R A R R U F R R A A R A
    Chr1 17 A R G G R U R R G G R G U R G
    Chr1 18 C R R R R R U R R R R R R R R
    Chr1 19 G C U R C R C U R R C C C R C
    Chr1 20 G A U R A R A U R R A A A R A
    Chr1 21 G T U R T R T U R R T T T R T
    Chr1 22 A T U R T R T U R R T T T R T
    Chr1 23 C T T R T R R R T R U T R T T
    Chr1 24 T R R R R R U R R R R R R R F
    Chr1 25 G R F R R R R R U R F R R R R
    Chr1 26 T R R C C C C C R R C R C R U
    Chr1 27 C R R G G G G R G R G R G R R
    Chr1 28 C G T T T G G T T F T G T T G
    Chr1 29 G T R R R R T R T R T T R T R
    Chr1 30 T R R R R R R R R R R R R R R
    Chr1 31 A R R R R F R R R R F R R R R
    Chr1 32 A G G R G G G R R G G G G G R
    Chr1 33 G R R R R R R R R R R R R R R
    Chr1 34 C R R R R R R R R R R R R R R
    Chr1 35 C R R R R R F R R R R R R R U
    Chr2 1 T R R R F R R R R R R R R R R
    Chr2 2 A C R R C C U R R R R C C C U
    Chr2 3 C R R R R R R U R R R R R R R
    Chr2 4 C R R R R R R R U R R R R F R
    Chr2 5 T R R R R R R R R R R R R R R
    Chr2 6 C R R R R R R R R R R R R R R
    Chr2 7 A T F R U R T T T R T T F T T
    Chr2 8 T R R R R R R R R R R R R R R
    Chr2 9 C R R R R R R R R R R R R R R
    Chr2 10 G R T T T T R T R R R R R U R
    Chr2 11 C R A A A A R A R R R R R U R
    Chr2 12 A R T T T T R T R R R R R U R
    Chr2 13 T R C C C C R C R R R R R U R
    Chr2 14 C T A A T A T A T A T T A A A
    Chr2 15 T R R R F R R R R R R R R R R
    Chr2 16 A R R R R R R R U R R R R R R
    Chr2 17 A R U R R R R R R R R R R R F
    Chr2 18 G R R R R R R R R R R R R R R
    Chr2 19 A R R R R R R F R R R R R R R
    Chr2 20 C R R R R R R R F R R R R R R
    Chr2 21 G A R R A A A R R R A A R R R
    Chr2 22 A R R R R R R F R R R R R R R
    Chr2 23 A R R T T R R T T T T T R R R
    Chr2 24 T R R R R R R U R R R R R R F
    Chr2 25 T R A A R R A R A R R A R R A
    Chr2 26 G R R R R R R R R R R R R R R
    Chr2 27 A R R R R R R R R R R R R U R
    Chr2 28 C R U R R F F R R F R F U R R
    Chr2 29 G R R R R R R F R R R R R R R
    Chr2 30 T A A G A G G A A G F G G G U
    Chr2 31 A R R R R R R R R U U R R R R
    Chr2 32 G R R R R R R U U R R R R R R
    Chr2 33 G R U R R R R U R R R R R R R
    Chr2 34 A R R R U R R R R R R R R R R
    Chr2 35 G R R R R R R R R R R R R R R
    Chr2 36 T R R R R R R U R R R R R R R
    Chr3 1 T U R R R R R U R R R R R R R
    Chr3 2 T R R U R R R U R R R R R R R
    Chr3 3 T F R R R R R U R R R R R R R
    Chr3 4 T R R F R R R U R R R R R R R


output :

- without A - H code : ::

    CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
    Chr1 7 A G C C C F C C C C C G C G G
    Chr1 9 C R T T R T T T U R T R T T T
    Chr1 13 A R R G G R F R F G R G R R F
    Chr1 16 G A R R A R R U F R R A A R A
    Chr1 17 A R G G R U R R G G R G U R G
    Chr1 19 G C U R C R C U R R C C C R C
    Chr1 20 G A U R A R A U R R A A A R A
    Chr1 21 G T U R T R T U R R T T T R T
    Chr1 22 A T U R T R T U R R T T T R T
    Chr1 23 C T T R T R R R T R U T R T T
    Chr1 26 T R R C C C C C R R C R C R U
    Chr1 27 C R R G G G G R G R G R G R R
    Chr1 28 C G T T T G G T T F T G T T G
    Chr1 29 G T R R R R T R T R T T R T R
    Chr1 32 A G G R G G G R R G G G G G R
    Chr2 2 A C R R C C U R R R R C C C U
    Chr2 7 A T F R U R T T T R T T F T T
    Chr2 10 G R T T T T R T R R R R R U R
    Chr2 11 C R A A A A R A R R R R R U R
    Chr2 12 A R T T T T R T R R R R R U R
    Chr2 13 T R C C C C R C R R R R R U R
    Chr2 14 C T A A T A T A T A T T A A A
    Chr2 21 G A R R A A A R R R A A R R R
    Chr2 23 A R R T T R R T T T T T R R R
    Chr2 25 T R A A R R A R A R R A R R A
    Chr2 30 T A A G A G G A A G F G G G U

- with A - H code but no merge : ::

    CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
    Chr1 7 - A H H H - H H H H H A H A A
    Chr1 9 - A H H A H H H - A H A H H H
    Chr1 13 - A A H H A - A - H A H A A -
    Chr1 16 - A H H A H H - - H H A A H A
    Chr1 17 - A H H A - A A H H A H - A H
    Chr1 19 - A - H A H A - H H A A A H A
    Chr1 20 - A - H A H A - H H A A A H A
    Chr1 21 - A - H A H A - H H A A A H A
    Chr1 22 - A - H A H A - H H A A A H A
    Chr1 23 - A A H A H H H A H - A H A A
    Chr1 26 - A A H H H H H A A H A H A -
    Chr1 27 - A A H H H H A H A H A H A A
    Chr1 28 - A H H H A A H H - H A H H A
    Chr1 29 - A H H H H A H A H A A H A H
    Chr1 32 - A A H A A A H H A A A A A H
    Chr2 2 - A H H A A - H H H H A A A -
    Chr2 7 - A - H - H A A A H A A - A A
    Chr2 10 - A H H H H A H A A A A A - A
    Chr2 11 - A H H H H A H A A A A A - A
    Chr2 12 - A H H H H A H A A A A A - A
    Chr2 13 - A H H H H A H A A A A A - A
    Chr2 14 - A H H A H A H A H A A H H H
    Chr2 21 - A H H A A A H H H A A H H H
    Chr2 23 - A A H H A A H H H H H A A A
    Chr2 25 - A H H A A H A H A A H A A H
    Chr2 30 - A A H A H H A A H - H H H -

- with A - H code and merge :

  - tab file : ::

      CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
      Chr1 *M_00001 - A H H H - H H H H H A H A A
      Chr1 *M_00002 - A H H A H H H - A H A H H H
      Chr1 *M_00003 - A A H H A - A - H A H A A -
      Chr1 *M_00004 - A H H A H H - - H H A A H A
      Chr1 *M_00005 - A H H A - A A H H A H - A H
      Chr1 *M_00006 - A - H A H A - H H A A A H A
      Chr1 *M_00007 - A A H A H H H A H - A H A A
      Chr1 *M_00008 - A A H H H H H A A H A H A -
      Chr1 *M_00009 - A A H H H H A H A H A H A A
      Chr1 *M_00010 - A H H H A A H H - H A H H A
      Chr1 *M_00011 - A H H H H A H A H A A H A H
      Chr1 *M_00012 - A A H A A A H H A A A A A H
      Chr2 *M_00013 - A H H A A - H H H H A A A -
      Chr2 *M_00014 - A - H - H A A A H A A - A A
      Chr2 *M_00015 - A H H H H A H A A A A A - A
      Chr2 *M_00016 - A H H A H A H A H A A H H H
      Chr2 *M_00017 - A H H A A A H H H A A H H H
      Chr2 *M_00018 - A A H H A A H H H H H A A A
      Chr2 *M_00019 - A H H A A H A H A A H A A H
      Chr2 *M_00020 - A A H A H H A A H - H H H -

  - bed file : ::

      Chr1 7 7 *M_00001
      Chr1 9 9 *M_00002
      Chr1 13 13 *M_00003
      Chr1 16 16 *M_00004
      Chr1 17 17 *M_00005
      Chr1 19 22 *M_00006
      Chr1 23 23 *M_00007
      Chr1 26 26 *M_00008
      Chr1 27 27 *M_00009
      Chr1 28 28 *M_00010
      Chr1 29 29 *M_00011
      Chr1 32 32 *M_00012
      Chr2 2 2 *M_00013
      Chr2 7 7 *M_00014
      Chr2 10 13 *M_00015
      Chr2 14 14 *M_00016
      Chr2 21 21 *M_00017
      Chr2 23 23 *M_00018
      Chr2 25 25 *M_00019
      Chr2 30 30 *M_00020


-----

**reference :**

    ]]>
    </help>
</tool>