<tool id="gd_sum_gd_snp" name="Aggregate Individuals" version="1.0.0">
  <description>: Append summary columns for a population</description>

  <command interpreter="python">
    modify_snp_table.py "$input" "$p1_input" "$output" "-1" "-1" "-1" "-1"
    #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns)
        #set $arg = '%s:%s' % ($individual_col, $individual)
        "$arg"
    #end for
  </command>

  <inputs>
    <param name="input" type="data" format="gd_snp" label="SNP dataset" />
    <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" />
  </inputs>

  <outputs>
    <data name="output" format="gd_snp" metadata_source="input" />
  </outputs>

  <tests>
    <test>
      <param name="input" value="test_in/sample.gd_snp" ftype="gd_snp" />
      <param name="p1_input" value="test_in/a.gd_indivs" ftype="gd_indivs" />
      <output name="output" file="test_out/modify_snp_table/modify.gd_snp" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input datasets are in gd_snp_ and gd_indivs_ formats.
The output dataset is in gd_snp_ format. (`Dataset missing?`_)

.. _gd_snp: ./static/formatHelp.html#gd_snp
.. _gd_indivs: ./static/formatHelp.html#gd_indivs
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

The user specifies that some of the individuals in a gd_snp dataset form a
"population" by supplying a list previously created with the
Specify Individuals tool. The program appends a
new "entity" (a set of four columns) to the gd_snp table, analogous to the columns
for an individual but containing summary data for the population as a group.
These four columns give the total counts for each of the two alleles, the "genotype" for
the population, and the maximum quality value, each taken over all individuals in the
population. If all defined genotypes in the population are 2 (agree with the
reference), the population's genotype is 2; if all are 0, it is 0; otherwise
the genotype is 1. If every individual has an undefined genotype, the population's
genotype is -1.
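
The summary rules above can be sketched in Python. This is a minimal
illustration, not the code of modify_snp_table.py: the function name
``summarize_population`` and its arguments are hypothetical, and it assumes
that each individual's four columns hold integers and that -1 marks an
undefined genotype.

```python
def summarize_population(row, columns):
    """Return [allele1_total, allele2_total, genotype, max_quality].

    row     -- list of field strings from one gd_snp line (hypothetical input)
    columns -- 1-based first column of each chosen individual's four columns
    """
    count1 = count2 = 0
    defined_genotypes = []
    max_quality = -1
    for col in columns:
        # Each individual occupies four consecutive columns:
        # allele-1 count, allele-2 count, genotype, quality.
        a1, a2, genotype, quality = (int(x) for x in row[col - 1:col + 3])
        # Assumption: counts are summed as-is; how the real script treats
        # missing counts is not described in this help text.
        count1 += a1
        count2 += a2
        if genotype != -1:
            defined_genotypes.append(genotype)
        max_quality = max(max_quality, quality)

    if not defined_genotypes:
        population_genotype = -1  # every individual is undefined
    elif all(g == 2 for g in defined_genotypes):
        population_genotype = 2   # all agree with the reference
    elif all(g == 0 for g in defined_genotypes):
        population_genotype = 0
    else:
        population_genotype = 1
    return [count1, count2, population_genotype, max_quality]


# First row of the example in this help text, individuals at columns 9, 13, 17.
line = ("Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C "
        "6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0")
print(summarize_population(line.split(), [9, 13, 17]))  # [29, 0, 2, 72]
```

The column numbers are the ones listed in a gd_indivs dataset, so the same
list that defines the population also locates each individual's columns.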

-----

**Example**

- input gd_snp::

    Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0
    Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0
    Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0
    etc.

- input individuals::

    9 PB1
    13 PB2
    17 PB3

- output::

    Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 29 0 2 72
    Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 3 0 2 30
    Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 13 0 2 42
    etc.

  </help>
</tool>