comparison modify_snp_table.xml @ 12:4b6590dd7250

Uploaded
author miller-lab
date Wed, 12 Sep 2012 17:10:26 -0400
11:d4ec09e8079f 12:4b6590dd7250
1 <tool id="gd_modify_gd_snp" name="Modify gd_snp" version="1.0.0">
2 <description>modify a gd_snp dataset</description>
3
4 <command interpreter="python">
5 modify_snp_table.py "$input" "$p1_input" "$output"
6 #if $limit_coverage.choice == "0"
7 "-1" "-1" "-1" "-1"
8 #else
9 "${limit_coverage.lo_coverage}" "${limit_coverage.hi_coverage}" "${limit_coverage.low_ind_cov}" "${limit_coverage.lo_quality}"
10 #end if
11 #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns)
12 #set $arg = '%s:%s' % ($individual_col, $individual)
13 "$arg"
14 #end for
15 </command>
16
17 <inputs>
18 <param name="input" type="data" format="gd_snp" label="gd_snp dataset" />
19 <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" />
20 <conditional name="limit_coverage">
21 <param name="choice" type="select" format="integer" label="Option">
22 <option value="0" selected="true">add columns to the gd_snp table</option>
23 <option value="1">discard some SNPs</option>
24 </param>
25 <when value="0" />
26 <when value="1">
27 <param name="lo_coverage" type="integer" min="0" value="0" label="Lower bound on total coverage" />
28 <param name="hi_coverage" type="integer" min="0" value="1000" label="Upper bound on total coverage" />
29 <param name="low_ind_cov" type="integer" min="0" value="0" label="Lower bound on individual coverage" />
30 <param name="lo_quality" type="integer" min="0" value="0" label="Lower bound on individual quality values" />
31 </when>
32 </conditional>
33 </inputs>
34
35 <outputs>
36 <data name="output" format="gd_snp" metadata_source="input" />
37 </outputs>
38
39 <tests>
40 <test>
41 <param name="input" value="test_in/sample.gd_snp" ftype="gd_snp" />
42 <param name="p1_input" value="test_in/a.gd_indivs" ftype="gd_indivs" />
43 <param name="choice" value="1" />
44 <param name="lo_coverage" value="0" />
45 <param name="hi_coverage" value="1000" />
46 <param name="low_ind_cov" value="3" />
47 <param name="lo_quality" value="30" />
48 <output name="output" file="test_out/modify_snp_table/modify.gd_snp" />
49 </test>
50 </tests>
51
52 <help>
53 **Dataset formats**
54
55 The input datasets are in gd_snp_ and gd_indivs_ formats.
56 The output dataset is in gd_snp_ format. (`Dataset missing?`_)
57
58 .. _Dataset missing?: ./static/formatHelp.html
59 .. _gd_snp: ./static/formatHelp.html#gd_snp
60 .. _gd_indivs: ./static/formatHelp.html#gd_indivs
61
62 **What it does**
63
64 The user specifies that some of the individuals in the selected gd_snp_ table
65 form a "population" that has been previously defined using the Galaxy tool to
66 select individuals from a gd_snp dataset. One option is for the program to append
67 four columns to the table, giving the total counts for the two alleles, the
68 "genotype" for the population, and the maximum quality value, taken over all
69 individuals in the population. If all defined genotypes in the population
70 are 2 (agree with the reference), the population's genotype is 2; similarly
71 for 0; otherwise the genotype is 1 (unless all individuals have an undefined
72 genotype, in which case it is -1). The other option is to remove rows from
73 the table for which the total coverage for the population is either too low
74 or too high, or for which the individual coverage or quality value is too low.
75
76 .. _gd_snp: ./static/formatHelp.html#gd_snp
77
78 **Examples**
79
80 - input gd_snp::
81
82 Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0
83 Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0
84 Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0
85 etc.
86
87 - input individuals::
88
89 9 PB1
90 13 PB2
91 17 PB3
92
93 - output from appending columns::
94
95 Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 29 0 2 72
96 Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 3 0 2 30
97 Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 13 0 2 42
98 etc.
99
100 - output from filtering SNPs with a minimum coverage of 3 for each individual::
101
102 Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0
103 Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0
104 etc.
105
106 </help>
107 </tool>
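
The two options described in the help text can be sketched in Python as follows. This is a minimal illustration, not the contents of modify_snp_table.py, which is not shown in this changeset. The function names (`individual_values`, `population_columns`, `keep_row`) are hypothetical. The sketch assumes that each individual occupies four consecutive columns (first-allele count, second-allele count, genotype, quality) starting at the 1-based column listed in the gd_indivs file, that an individual's coverage is the sum of its two allele counts, and that all bounds are inclusive. The sample rows are the ones from the help text's examples.

```python
from typing import List, Sequence, Tuple


def individual_values(fields: Sequence[str], col: int) -> Tuple[int, int, int, int]:
    """Read one individual's four values (assumed layout, see lead-in)."""
    i = col - 1  # gd_indivs columns are 1-based; Python indices are 0-based
    a1, a2, genotype, quality = (int(x) for x in fields[i:i + 4])
    return a1, a2, genotype, quality


def population_columns(fields: Sequence[str], indiv_cols: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (allele-1 total, allele-2 total, population genotype, max quality)."""
    values = [individual_values(fields, c) for c in indiv_cols]
    total_a1 = sum(v[0] for v in values)
    total_a2 = sum(v[1] for v in values)
    defined = [v[2] for v in values if v[2] != -1]  # -1 marks an undefined genotype
    if not defined:
        genotype = -1
    elif all(g == 2 for g in defined):
        genotype = 2
    elif all(g == 0 for g in defined):
        genotype = 0
    else:
        genotype = 1
    max_quality = max(v[3] for v in values)
    return total_a1, total_a2, genotype, max_quality


def keep_row(fields: Sequence[str], indiv_cols: Sequence[int],
             lo_coverage: int, hi_coverage: int,
             low_ind_cov: int, lo_quality: int) -> bool:
    """Return False if the row should be discarded (assumed inclusive bounds)."""
    values = [individual_values(fields, c) for c in indiv_cols]
    coverages = [a1 + a2 for a1, a2, _, _ in values]
    if not lo_coverage <= sum(coverages) <= hi_coverage:
        return False  # total population coverage out of range
    if any(c < low_ind_cov for c in coverages):
        return False  # some individual has too little coverage
    if any(v[3] < lo_quality for v in values):
        return False  # some individual has too low a quality value
    return True


rows: List[str] = [
    "Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0",
    "Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0",
    "Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0",
]
population = [9, 13, 17]  # columns of PB1, PB2, PB3 from the example gd_indivs file

appended = [population_columns(r.split(), population) for r in rows]
print(appended)  # [(29, 0, 2, 72), (3, 0, 2, 30), (13, 0, 2, 42)]

# Same bounds as the <test> section: 0, 1000, 3, 30
kept = [r.split()[0] for r in rows if keep_row(r.split(), population, 0, 1000, 3, 30)]
print(kept)  # ['Contig161_chr1_4641264_4641879', 'Contig20_chr1_21313469_21313570']
```

Under these assumptions the sketch reproduces the appended columns and the filtered rows shown in the help text's examples. The Contig48 row is dropped because each population individual has a coverage of 1, below the individual bound of 3.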