comparison VCFCarto_wrapper.xml @ 0:3552a8d9f51c draft

Uploaded
author urgi-team
date Tue, 10 Nov 2015 08:30:56 -0500
parents
children
-1:000000000000 0:3552a8d9f51c
1 <tool id="VCFCarto" name="VCFCarto" version="0.01">
2 <description>VCFcarto can convert a tabulated marker file into a file with only the markers from 2 parents </description>
3 <requirements>
4 <requirement type="package" version="1.0">VCF_Gandalf_Tools</requirement>
5 </requirements>
6 <version_command>
7 VCFCarto.py --version
8 </version_command>
9 <command interpreter="python">
10 VCFCarto_wrapper.py -f $inputTabular -o $outputVCFCarto -A $parentA -H $parentH
11 #if str($outputType) == "carto"
12 -p -g --graphHTML $output_html --dirGraphs "$output_html.files_path"
13 #end if
14 #if str($outputType) == "MergedCarto"
15 -p -g --graphHTML $output_html --dirGraphs "$output_html.files_path" -m --mergeFile $output_bed
16 #end if
17 </command>
18 <inputs>
19 <param name="inputTabular" type="data" format="tabular" label="indicate your tabulated marker file"/>
20 <param name="parentA" size="20" type="text" value="V1" label="indicate parent 1 name (A)"/>
21 <param name="parentH" size="20" type="text" value="V2" label="indicate parent 2 name (H)"/>
22 <param name="outputType" type="select" display="radio" label="select type of output" multiple="False">
23 <option value="raw" >7 character code</option>
24 <option value="carto" >A - H code</option>
25 <option value="MergedCarto" >A - H code and merge</option>
26 </param>
27 </inputs>
28 <outputs>
29 <data format="tabular" name="outputVCFCarto" label="${tool.name} on ${on_string} (tabular)"/>
30 <data format="html" name="output_html" label="${tool.name} graphs on ${on_string} (html)">
31 <filter>not outputType == "raw"</filter>
32 </data>
33 <data format="bed" name="output_bed" label="${tool.name} markers on ${on_string} (bed)">
34 <filter>outputType == "MergedCarto"</filter>
35 </data>
36 </outputs>
37 <tests>
38 <test>
39 <param name="inputTabular" value="VCFCarto_input.tab"/>
40 <param name="parentA" value="REF1"/>
41 <param name="parentH" value="REF2"/>
42 <param name="outputType" value="raw"/>
43 <output name="outputVCFCarto" file="VCFCarto_output.tab" ftype="tabular"/>
44 </test>
45 <test>
46 <param name="inputTabular" value="VCFCarto_input.tab"/>
47 <param name="parentA" value="REF1"/>
48 <param name="parentH" value="REF2"/>
49 <param name="outputType" value="MergedCarto"/>
50 <output name="outputVCFCarto" file="VCFCarto_output_merged.tab" ftype="tabular"/>
51 <output name="output_bed" file="VCFCarto_output_merged.bed" ftype="bed"/>
52 </test>
53 </tests>
54 <help><![CDATA[
55
56 **VCFcarto converts a tabulated marker file into a file with only the markers from 2 parents**
57
58 .. class:: infomark
59
60 expected input format is the output from VCFStorage.
61
62 -----
63
64 **what it does :**
65
66 VCFcarto converts a tabulated marker file into a file that keeps only the markers distinguishing 2 parents, refA and refH (positions where the two parents have different alleles).
67
68 Two output formats are possible: either the input format is kept, or the codes are converted into a 3-symbol format (A, H, -).
69
70 -----
71
72 **input format :**
73
74 .. class:: infomark
75
76 expected input format is the output from VCFStorage.
77
78 The expected format is a tab-delimited file in which genomic positions are in rows and strains are in columns.
79
80 For each position and each genome, a code is attributed :
81
82 - for the reference : ::
83
84 A,T,G,C for the corresponding nucleotide
85
86 - for the genomes : ::
87
88 U if the position was not referenced in the VCF file
89 R if the base is identical to the reference
90 F if the base has been filtered out
91 A,T,G,C if the genome has a validated SNP at the position
92
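The input layout described above can be read with a few lines of Python. This is only a minimal sketch, not the code used by VCFCarto.py: the function name `read_marker_table` and the validation rules are assumptions made for illustration, based on the codes listed above.

```python
import csv
import io

# Codes taken from the description above.
REFERENCE_CODES = set("ATGC")
GENOME_CODES = set("URFATGC")


def read_marker_table(handle):
    """Read a tab-delimited marker table (hypothetical helper).

    Returns the header fields and a list of
    (chrom, pos, reference, genome_codes) tuples.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = []
    for line_number, fields in enumerate(reader, start=2):
        if not fields:
            continue  # skip empty lines
        chrom, pos, reference = fields[0], int(fields[1]), fields[2]
        genomes = fields[3:]
        if reference not in REFERENCE_CODES:
            raise ValueError("line %d: unexpected reference code %r"
                             % (line_number, reference))
        unexpected = [code for code in genomes if code not in GENOME_CODES]
        if unexpected:
            raise ValueError("line %d: unexpected genome code(s) %r"
                             % (line_number, unexpected))
        rows.append((chrom, pos, reference, genomes))
    return header, rows


# Usage on a small excerpt shaped like the example further down.
example = (
    "CHROM\tPOS\treference\tREF1\tG01\tREF2\n"
    "Chr1\t7\tA\tG\tC\tC\n"
    "Chr1\t9\tC\tR\tT\tT\n"
)
header, rows = read_marker_table(io.StringIO(example))
print(header[3:])  # strain names
print(rows[0])
```
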
93 -----
94
95 **output format :**
96
97 for the main output, 2 formats are possible :
98
99 - The first format is the same as the input format (same columns and codes), but only the lines where the 2 parents have different alleles are kept.
100
101 - The second format (A - H format) will have a much simpler code ::
102
103 "A" when the strain allele is the same as parent A
104 "H" when the strain allele is the same as parent H
105 "-" in any other case (base filtered out, different base, base unmapped etc...)
106
107 the second format may be used as an input for a cartographic tool.
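
The two output formats described above can be sketched as follows. This is an illustration, not the VCFCarto.py implementation: the helper names are hypothetical, and the rule that a parent coded U or F makes the line uninformative is an assumption inferred from the example below (lines where a parent is U or F do not appear in the output).

```python
# Assumption: only these parent codes count as a resolved allele.
RESOLVED_CODES = set("RATGC")


def parents_differ(code_a, code_h):
    """True when both parents have a resolved allele and the alleles differ."""
    return (code_a in RESOLVED_CODES
            and code_h in RESOLVED_CODES
            and code_a != code_h)


def to_ah_codes(genome_codes, index_a, index_h):
    """Translate the genome columns of one line into the A - H code.

    genome_codes excludes the reference column; in the example output
    the reference column is always written as "-".
    """
    code_a = genome_codes[index_a]
    code_h = genome_codes[index_h]
    converted = []
    for code in genome_codes:
        if code == code_a:
            converted.append("A")
        elif code == code_h:
            converted.append("H")
        else:
            converted.append("-")  # filtered, unmapped or other base
    return converted


# Genome columns of Chr1 position 7 from the example (REF1 first, REF2 third).
genomes = "G C C C F C C C C C G C G G".split()
if parents_differ(genomes[0], genomes[2]):
    print(" ".join(to_ah_codes(genomes, 0, 2)))
```

Comparing each code with the two parent codes is enough here because the filter has already guaranteed that the parents differ.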
108
109 If you choose the A - H format, you can also merge consecutive markers that carry the same information (every strain has the same code in both markers). In that case, new markers are generated and a bed file links the input markers to the output markers.
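
The merge step can be sketched as below. It is a minimal illustration under stated assumptions, not the tool's own code: `merge_markers` is a hypothetical name, "consecutive" is taken to mean adjacent lines of the filtered table on the same chromosome, and the `*M_00001` naming and the start/end columns of the bed file are copied from the example below.

```python
def merge_markers(markers):
    """Merge adjacent markers that carry identical A - H codes.

    markers: list of (chrom, pos, codes) in file order.
    Returns (tab_rows, bed_rows), where tab_rows holds
    (chrom, marker_name, codes) and bed_rows holds
    (chrom, start, end, marker_name).
    """
    groups = []  # each group is [chrom, start, end, codes]
    for chrom, pos, codes in markers:
        codes = list(codes)
        previous = groups[-1] if groups else None
        if previous and previous[0] == chrom and previous[3] == codes:
            previous[2] = pos  # extend the current group
        else:
            groups.append([chrom, pos, pos, codes])

    tab_rows, bed_rows = [], []
    for number, (chrom, start, end, codes) in enumerate(groups, start=1):
        name = "*M_%05d" % number
        tab_rows.append((chrom, name, codes))
        bed_rows.append((chrom, start, end, name))
    return tab_rows, bed_rows


# Shortened markers (two strains only) to keep the usage readable.
markers = [
    ("Chr1", 17, ["A", "H"]),
    ("Chr1", 19, ["A", "-"]),
    ("Chr1", 20, ["A", "-"]),
    ("Chr1", 22, ["A", "-"]),
    ("Chr2", 2, ["A", "-"]),
]
tab_rows, bed_rows = merge_markers(markers)
for row in bed_rows:
    print("\t".join(str(field) for field in row))
```

Markers on different chromosomes are never merged in this sketch, even when their codes are identical.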
110
111 Finally, with either A - H format, graphs are produced as an HTML output to visualise the result.
112
113 -----
114
115 **example :**
116
117 input : ::
118
119 CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
120 Chr1 1 A R R R R U R R R R R R R R R
121 Chr1 2 T R R R R R U R R R R R R R R
122 Chr1 3 G R R R R R R R R R R R R R R
123 Chr1 4 G R R R R R R R R R R R R F R
124 Chr1 5 G R R R R R R U F R R R R R R
125 Chr1 6 C R R R R R R R R R R R R R U
126 Chr1 7 A G C C C F C C C C C G C G G
127 Chr1 8 G R R R R R R R R R R R R R R
128 Chr1 9 C R T T R T T T U R T R T T T
129 Chr1 10 T R R R R R R R R R R R R R U
130 Chr1 11 T R R R R R R R R R R R F R R
131 Chr1 12 A R R R R U R R R R F R R R R
132 Chr1 13 A R R G G R F R F G R G R R F
133 Chr1 14 A R R R R R R R R F R R R R R
134 Chr1 15 G R R R U R F R R R R R R U U
135 Chr1 16 G A R R A R R U F R R A A R A
136 Chr1 17 A R G G R U R R G G R G U R G
137 Chr1 18 C R R R R R U R R R R R R R R
138 Chr1 19 G C U R C R C U R R C C C R C
139 Chr1 20 G A U R A R A U R R A A A R A
140 Chr1 21 G T U R T R T U R R T T T R T
141 Chr1 22 A T U R T R T U R R T T T R T
142 Chr1 23 C T T R T R R R T R U T R T T
143 Chr1 24 T R R R R R U R R R R R R R F
144 Chr1 25 G R F R R R R R U R F R R R R
145 Chr1 26 T R R C C C C C R R C R C R U
146 Chr1 27 C R R G G G G R G R G R G R R
147 Chr1 28 C G T T T G G T T F T G T T G
148 Chr1 29 G T R R R R T R T R T T R T R
149 Chr1 30 T R R R R R R R R R R R R R R
150 Chr1 31 A R R R R F R R R R F R R R R
151 Chr1 32 A G G R G G G R R G G G G G R
152 Chr1 33 G R R R R R R R R R R R R R R
153 Chr1 34 C R R R R R R R R R R R R R R
154 Chr1 35 C R R R R R F R R R R R R R U
155 Chr2 1 T R R R F R R R R R R R R R R
156 Chr2 2 A C R R C C U R R R R C C C U
157 Chr2 3 C R R R R R R U R R R R R R R
158 Chr2 4 C R R R R R R R U R R R R F R
159 Chr2 5 T R R R R R R R R R R R R R R
160 Chr2 6 C R R R R R R R R R R R R R R
161 Chr2 7 A T F R U R T T T R T T F T T
162 Chr2 8 T R R R R R R R R R R R R R R
163 Chr2 9 C R R R R R R R R R R R R R R
164 Chr2 10 G R T T T T R T R R R R R U R
165 Chr2 11 C R A A A A R A R R R R R U R
166 Chr2 12 A R T T T T R T R R R R R U R
167 Chr2 13 T R C C C C R C R R R R R U R
168 Chr2 14 C T A A T A T A T A T T A A A
169 Chr2 15 T R R R F R R R R R R R R R R
170 Chr2 16 A R R R R R R R U R R R R R R
171 Chr2 17 A R U R R R R R R R R R R R F
172 Chr2 18 G R R R R R R R R R R R R R R
173 Chr2 19 A R R R R R R F R R R R R R R
174 Chr2 20 C R R R R R R R F R R R R R R
175 Chr2 21 G A R R A A A R R R A A R R R
176 Chr2 22 A R R R R R R F R R R R R R R
177 Chr2 23 A R R T T R R T T T T T R R R
178 Chr2 24 T R R R R R R U R R R R R R F
179 Chr2 25 T R A A R R A R A R R A R R A
180 Chr2 26 G R R R R R R R R R R R R R R
181 Chr2 27 A R R R R R R R R R R R R U R
182 Chr2 28 C R U R R F F R R F R F U R R
183 Chr2 29 G R R R R R R F R R R R R R R
184 Chr2 30 T A A G A G G A A G F G G G U
185 Chr2 31 A R R R R R R R R U U R R R R
186 Chr2 32 G R R R R R R U U R R R R R R
187 Chr2 33 G R U R R R R U R R R R R R R
188 Chr2 34 A R R R U R R R R R R R R R R
189 Chr2 35 G R R R R R R R R R R R R R R
190 Chr2 36 T R R R R R R U R R R R R R R
191 Chr3 1 T U R R R R R U R R R R R R R
192 Chr3 2 T R R U R R R U R R R R R R R
193 Chr3 3 T F R R R R R U R R R R R R R
194 Chr3 4 T R R F R R R U R R R R R R R
195
196
197 output :
198
199 - without A - H code : ::
200
201 CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
202 Chr1 7 A G C C C F C C C C C G C G G
203 Chr1 9 C R T T R T T T U R T R T T T
204 Chr1 13 A R R G G R F R F G R G R R F
205 Chr1 16 G A R R A R R U F R R A A R A
206 Chr1 17 A R G G R U R R G G R G U R G
207 Chr1 19 G C U R C R C U R R C C C R C
208 Chr1 20 G A U R A R A U R R A A A R A
209 Chr1 21 G T U R T R T U R R T T T R T
210 Chr1 22 A T U R T R T U R R T T T R T
211 Chr1 23 C T T R T R R R T R U T R T T
212 Chr1 26 T R R C C C C C R R C R C R U
213 Chr1 27 C R R G G G G R G R G R G R R
214 Chr1 28 C G T T T G G T T F T G T T G
215 Chr1 29 G T R R R R T R T R T T R T R
216 Chr1 32 A G G R G G G R R G G G G G R
217 Chr2 2 A C R R C C U R R R R C C C U
218 Chr2 7 A T F R U R T T T R T T F T T
219 Chr2 10 G R T T T T R T R R R R R U R
220 Chr2 11 C R A A A A R A R R R R R U R
221 Chr2 12 A R T T T T R T R R R R R U R
222 Chr2 13 T R C C C C R C R R R R R U R
223 Chr2 14 C T A A T A T A T A T T A A A
224 Chr2 21 G A R R A A A R R R A A R R R
225 Chr2 23 A R R T T R R T T T T T R R R
226 Chr2 25 T R A A R R A R A R R A R R A
227 Chr2 30 T A A G A G G A A G F G G G U
228
229 - with A - H code but no merge : ::
230
231 CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
232 Chr1 7 - A H H H - H H H H H A H A A
233 Chr1 9 - A H H A H H H - A H A H H H
234 Chr1 13 - A A H H A - A - H A H A A -
235 Chr1 16 - A H H A H H - - H H A A H A
236 Chr1 17 - A H H A - A A H H A H - A H
237 Chr1 19 - A - H A H A - H H A A A H A
238 Chr1 20 - A - H A H A - H H A A A H A
239 Chr1 21 - A - H A H A - H H A A A H A
240 Chr1 22 - A - H A H A - H H A A A H A
241 Chr1 23 - A A H A H H H A H - A H A A
242 Chr1 26 - A A H H H H H A A H A H A -
243 Chr1 27 - A A H H H H A H A H A H A A
244 Chr1 28 - A H H H A A H H - H A H H A
245 Chr1 29 - A H H H H A H A H A A H A H
246 Chr1 32 - A A H A A A H H A A A A A H
247 Chr2 2 - A H H A A - H H H H A A A -
248 Chr2 7 - A - H - H A A A H A A - A A
249 Chr2 10 - A H H H H A H A A A A A - A
250 Chr2 11 - A H H H H A H A A A A A - A
251 Chr2 12 - A H H H H A H A A A A A - A
252 Chr2 13 - A H H H H A H A A A A A - A
253 Chr2 14 - A H H A H A H A H A A H H H
254 Chr2 21 - A H H A A A H H H A A H H H
255 Chr2 23 - A A H H A A H H H H H A A A
256 Chr2 25 - A H H A A H A H A A H A A H
257 Chr2 30 - A A H A H H A A H - H H H -
258
259 - with A - H code and merge :
260
261 - tab file : ::
262
263 CHROM POS reference REF1 G01 REF2 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12
264 Chr1 *M_00001 - A H H H - H H H H H A H A A
265 Chr1 *M_00002 - A H H A H H H - A H A H H H
266 Chr1 *M_00003 - A A H H A - A - H A H A A -
267 Chr1 *M_00004 - A H H A H H - - H H A A H A
268 Chr1 *M_00005 - A H H A - A A H H A H - A H
269 Chr1 *M_00006 - A - H A H A - H H A A A H A
270 Chr1 *M_00007 - A A H A H H H A H - A H A A
271 Chr1 *M_00008 - A A H H H H H A A H A H A -
272 Chr1 *M_00009 - A A H H H H A H A H A H A A
273 Chr1 *M_00010 - A H H H A A H H - H A H H A
274 Chr1 *M_00011 - A H H H H A H A H A A H A H
275 Chr1 *M_00012 - A A H A A A H H A A A A A H
276 Chr2 *M_00013 - A H H A A - H H H H A A A -
277 Chr2 *M_00014 - A - H - H A A A H A A - A A
278 Chr2 *M_00015 - A H H H H A H A A A A A - A
279 Chr2 *M_00016 - A H H A H A H A H A A H H H
280 Chr2 *M_00017 - A H H A A A H H H A A H H H
281 Chr2 *M_00018 - A A H H A A H H H H H A A A
282 Chr2 *M_00019 - A H H A A H A H A A H A A H
283 Chr2 *M_00020 - A A H A H H A A H - H H H -
284
285 - bed file : ::
286
287 Chr1 7 7 *M_00001
288 Chr1 9 9 *M_00002
289 Chr1 13 13 *M_00003
290 Chr1 16 16 *M_00004
291 Chr1 17 17 *M_00005
292 Chr1 19 22 *M_00006
293 Chr1 23 23 *M_00007
294 Chr1 26 26 *M_00008
295 Chr1 27 27 *M_00009
296 Chr1 28 28 *M_00010
297 Chr1 29 29 *M_00011
298 Chr1 32 32 *M_00012
299 Chr2 2 2 *M_00013
300 Chr2 7 7 *M_00014
301 Chr2 10 13 *M_00015
302 Chr2 14 14 *M_00016
303 Chr2 21 21 *M_00017
304 Chr2 23 23 *M_00018
305 Chr2 25 25 *M_00019
306 Chr2 30 30 *M_00020
307
308
309 -----
310
311 **reference :**
312
313 ]]>
314 </help>
315 </tool>