annotate offspring_heterozygosity.xml @ 32:03c22b722882

remove BeautifulSoup dependency
author Richard Burhans <burhans@bx.psu.edu>
date Fri, 20 Sep 2013 13:54:23 -0400
parents a631c2f6d913
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
31
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
1 <tool id="gd_offspring_heterozygosity" name="Pairs sequenced" version="1.0.0">
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
2 <description>: Offspring estimated heterozygosity of sequenced pairs</description>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
3
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
4 <command interpreter="python">
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
5 #import json
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
6 #import base64
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
7 #import zlib
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
8 #set $ind_names = $input.dataset.metadata.individual_names
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
9 #set $ind_colms = $input.dataset.metadata.individual_columns
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
10 #set $ind_dict = dict(zip($ind_names, $ind_colms))
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
11 #set $ind_json = json.dumps($ind_dict, separators=(',',':'))
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
12 #set $ind_comp = zlib.compress($ind_json, 9)
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
13 #set $ind_arg = base64.b64encode($ind_comp)
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
14 offspring_heterozygosity.py '$input' '$input.ext' '$ind_arg' '$p1_input' '$p2_input' '$output'
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
15 </command>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
16
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
17 <inputs>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
18 <param name="input" type="data" format="gd_snp,gd_genotype" label="SNP dataset" />
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
19 <param name="p1_input" type="data" format="gd_indivs" label="First individuals dataset" />
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
20 <param name="p2_input" type="data" format="gd_indivs" label="Second individuals dataset" />
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
21 </inputs>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
22
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
23 <outputs>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
24 <data name="output" format="txt" />
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
25 </outputs>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
26
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
27 <requirements>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
28 <requirement type="package" version="0.1">gd_c_tools</requirement>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
29 </requirements>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
30
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
31 <!--
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
32 <tests>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
33 </tests>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
34 -->
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
35
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
36 <help>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
37
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
38 **Dataset formats**
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
39
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
40 The input datasets are in gd_snp_, gd_genotype_, and gd_indivs_ formats.
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
41 The output dataset is in text_ format.
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
42
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
43 .. _gd_snp: ./static/formatHelp.html#gd_snp
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
44 .. _gd_genotype: ./static/formatHelp.html#gd_genotype
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
45 .. _gd_indivs: ./static/formatHelp.html#gd_indivs
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
46 .. _text: ./static/formatHelp.html#text
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
47
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
48 -----
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
49
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
50 **What it does**
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
51
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
52 For each pair of individuals, one from each specified set, the program
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
53 computes the expected heterozygosity of any offspring of the pair, i.e.,
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
54 the probability that the offspring has distinct nucleotides at a randomly
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
55 chosen autosomal SNP. In other words, we add the following numbers for
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
56 each autosomal SNP where both genotypes are defined, then divide by the
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
57 number of those SNPs:
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
58
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
59 0 if the individuals are homozygous for the same nucleotide
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
60
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
61 1 if the individuals are homozygous for different nucleotides
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
62
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
63 1/2 otherwise (i.e., if one or both individuals are heterozygous)
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
64
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
65 A SNP is ignored if one or both individuals have an undefined genotype
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
66 (designated as -1).
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
67 </help>
a631c2f6d913 Update to Miller Lab devshed revision 3c4110ffacc3
Richard Burhans <burhans@bx.psu.edu>
parents:
diff changeset
68 </tool>