annotate diversity_pi.xml @ 35:ea52b23f1141

Bug fixes for Draw variants, Phylip, and gd_d_tools
author Richard Burhans <burhans@bx.psu.edu>
date Wed, 20 Nov 2013 13:46:10 -0500
parents a631c2f6d913
children
rev   line source
31
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents: 27
diff changeset
1 <tool id="gd_diversity_pi" name="Diversity" version="1.1.0">
2 <description>: pi, allowing for unsequenced intervals</description>
24
248b06e86022 Added gd_genotype datatype. Modified tools to support new datatype.
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
3
4 <command interpreter="python">
27
8997f2ca8c7a Update to Miller Lab devshed revision bae0d3306d3b
Richard Burhans <burhans@bx.psu.edu>
parents: 24
diff changeset
5 #import json
6 #import base64
7 #import zlib
31
8 #set $snp_names = $input.dataset.metadata.individual_names
9 #set $snp_colms = $input.dataset.metadata.individual_columns
10 #set $snp_dict = dict(zip($snp_names, $snp_colms))
11 #set $snp_json = json.dumps($snp_dict, separators=(',',':'))
12 #set $snp_comp = zlib.compress($snp_json, 9)
13 #set $snp_arg = base64.b64encode($snp_comp)
14 #if $use_cov.choice == '1'
15 #set $cov_file = $use_cov.cov_input
16 #set $cov_ext = $use_cov.cov_input.ext
17 #set $cov_names = $use_cov.cov_input.dataset.metadata.individual_names
18 #set $cov_colms = $use_cov.cov_input.dataset.metadata.individual_columns
19 #set $cov_dict = dict(zip($cov_names, $cov_colms))
20 #set $cov_json = json.dumps($cov_dict, separators=(',',':'))
21 #set $cov_comp = zlib.compress($cov_json, 9)
22 #set $cov_arg = base64.b64encode($cov_comp)
23 #set $cov_min = $use_cov.min_coverage
24 #set $cov_req = $use_cov.req_thresh
25 #else
26 #set $cov_file = '/dev/null'
27 #set $cov_ext = ''
28 #set $cov_arg = ''
29 #set $cov_min = 0
30 #set $cov_req = 0
31 #end if
32 diversity_pi.py '$input' '$input.ext' '$snp_arg' '$cov_file' '$cov_ext' '$cov_arg' '$indiv_input' '$cov_min' '$cov_req' '$output'
24
33 </command>
34
35 <inputs>
31
36 <param name="input" type="data" format="gd_snp,gd_genotype" label="SNP/Genotype dataset" />
37 <conditional name="use_cov">
38 <param name="choice" type="select" format="integer" label="Include Coverage dataset">
39 <option value="1" selected="true">yes</option>
40 <option value="0">no</option>
41 </param>
42 <when value="0" />
43 <when value="1">
44 <param name="cov_input" type="data" format="gd_snp,gd_genotype" label="Coverage dataset" />
45 <param name="min_coverage" type="integer" min="1" value="1" label="Minimum coverage" />
46 <param name="req_thresh" type="integer" min="1" value="1" label="Lower bound for shared well-covered bp" />
47 </when>
48 </conditional>
24
49 <param name="indiv_input" type="data" format="gd_indivs" label="Population Individuals" />
50 </inputs>
51
52 <outputs>
53 <data name="output" format="txt" metadata_source="input" />
54 </outputs>
55
31
56 <requirements>
57 <requirement type="package" version="0.1">gd_c_tools</requirement>
58 </requirements>
59
24
60 <help>
31
61 **What it does**
62
63 The user supplies the following:
64
65 1. A file in gd_genotype or gd_snp format giving the mitochondrial SNPs.
66 2. An optional gd_genotype file giving the sequence coverage for each individual at each mitochondrial position.
67 3. A set of individuals specified with the "Specify individuals" tool.
68 4. The minimum depth of sequence coverage. Positions where an individual has less coverage are ignored.
69 5. The number of adequately covered positions that must be shared by two individuals before their diversity is included in the reported average.
70
71 For each pair of individuals with adequate shared coverage, the program divides the number of nucleotide differences between the two individuals within their shared well-covered intervals by the total length of those intervals. These ratios are then averaged over all qualifying pairs of individuals.
24
72 </help>
73 </tool>
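
The averaging described in the tool's help text can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in diversity_pi.py: the function name `pairwise_pi` and the data layout (one base string and one list of coverage depths per individual, indexed by position) are hypothetical, and the real tool reads gd_snp/gd_genotype tables rather than per-position sequences. The parameters `min_coverage` and `req_thresh` mirror the wrapper's inputs of the same names.

```python
from itertools import combinations


def pairwise_pi(bases, depth, min_coverage=1, req_thresh=1):
    """Average pairwise diversity over individuals (illustrative sketch).

    bases: dict mapping individual name -> string of bases, one per position
    depth: dict mapping individual name -> list of coverage depths per position
    Both layouts are assumptions made for this example.
    """
    ratios = []
    for a, b in combinations(sorted(bases), 2):
        # Positions where BOTH individuals reach the minimum coverage.
        shared = [
            i
            for i, (da, db) in enumerate(zip(depth[a], depth[b]))
            if da >= min_coverage and db >= min_coverage
        ]
        # Skip pairs that share too few well-covered positions.
        if not shared or len(shared) < req_thresh:
            continue
        diffs = sum(1 for i in shared if bases[a][i] != bases[b][i])
        ratios.append(diffs / len(shared))
    # No qualifying pair: return None instead of dividing by zero.
    return sum(ratios) / len(ratios) if ratios else None


# Hypothetical toy data: ind2 has no coverage at the last position.
toy_bases = {"ind1": "ACGT", "ind2": "ACGA", "ind3": "TCGT"}
toy_depth = {"ind1": [5, 5, 5, 5], "ind2": [5, 5, 5, 0], "ind3": [5, 5, 5, 5]}

# With req_thresh=4 only the ind1/ind3 pair qualifies: 1 difference / 4 positions.
print(pairwise_pi(toy_bases, toy_depth, min_coverage=1, req_thresh=4))  # prints 0.25
```

Lowering `req_thresh` to 3 in the toy data admits all three pairs, so the result becomes the mean of 0/3, 1/4 and 1/3. Pairs below the threshold are dropped from the average rather than counted as zero, matching the help text's wording that only pairs with adequate shared coverage contribute.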