comparison offspring_heterozygosity.xml @ 31:a631c2f6d913

Update to Miller Lab devshed revision 3c4110ffacc3
author Richard Burhans <burhans@bx.psu.edu>
date Fri, 20 Sep 2013 13:25:27 -0400
30:4188853b940b 31:a631c2f6d913
1 <tool id="gd_offspring_heterozygosity" name="Pairs sequenced" version="1.0.0">
2 <description>: Offspring estimated heterozygosity of sequenced pairs</description>
3
4 <command interpreter="python">
5 #import json
6 #import base64
7 #import zlib
8 #set $ind_names = $input.dataset.metadata.individual_names
9 #set $ind_colms = $input.dataset.metadata.individual_columns
10 #set $ind_dict = dict(zip($ind_names, $ind_colms))
11 #set $ind_json = json.dumps($ind_dict, separators=(',',':'))
12 #set $ind_comp = zlib.compress($ind_json, 9)
13 #set $ind_arg = base64.b64encode($ind_comp)
14 offspring_heterozygosity.py '$input' '$input.ext' '$ind_arg' '$p1_input' '$p2_input' '$output'
15 </command>
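<!--
The template above packs the individual name to column mapping into one
command-line argument: JSON, then zlib at level 9, then base64. Below is a
minimal sketch of that round trip, assuming Python 3. The function names
encode_individuals and decode_individuals are hypothetical, and how
offspring_heterozygosity.py actually unpacks the argument is not shown here.

```python
import base64
import json
import zlib


def encode_individuals(names, columns):
    """Mirror the template: build the dict, dump compact JSON, compress, base64."""
    ind_dict = dict(zip(names, columns))
    ind_json = json.dumps(ind_dict, separators=(",", ":"))
    # Python 3 compresses bytes, so the JSON text is encoded first.
    ind_comp = zlib.compress(ind_json.encode("utf-8"), 9)
    return base64.b64encode(ind_comp).decode("ascii")


def decode_individuals(ind_arg):
    """Assumed receiving side: undo base64, then zlib, then JSON."""
    ind_json = zlib.decompress(base64.b64decode(ind_arg)).decode("utf-8")
    return json.loads(ind_json)


if __name__ == "__main__":
    # Illustrative names and column numbers, not taken from a real dataset.
    arg = encode_individuals(["PB1", "PB2"], [9, 13])
    print(decode_individuals(arg))
```

Base64 output contains only letters, digits, "+", "/", and "=", so the value
can sit inside the single-quoted '$ind_arg' on the command line without
further escaping.
-->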
16
17 <inputs>
18 <param name="input" type="data" format="gd_snp,gd_genotype" label="SNP dataset" />
19 <param name="p1_input" type="data" format="gd_indivs" label="First individuals dataset" />
20 <param name="p2_input" type="data" format="gd_indivs" label="Second individuals dataset" />
21 </inputs>
22
23 <outputs>
24 <data name="output" format="txt" />
25 </outputs>
26
27 <requirements>
28 <requirement type="package" version="0.1">gd_c_tools</requirement>
29 </requirements>
30
31 <!--
32 <tests>
33 </tests>
34 -->
35
36 <help>
37
38 **Dataset formats**
39
40 The input datasets are in gd_snp_, gd_genotype_, and gd_indivs_ formats.
41 The output dataset is in text_ format.
42
43 .. _gd_snp: ./static/formatHelp.html#gd_snp
44 .. _gd_genotype: ./static/formatHelp.html#gd_genotype
45 .. _gd_indivs: ./static/formatHelp.html#gd_indivs
46 .. _text: ./static/formatHelp.html#text
47
48 -----
49
50 **What it does**
51
52 For each pair of individuals, one from each specified set, the program
53 computes the expected heterozygosity of any offspring of the pair, i.e.,
54 the probability that the offspring has distinct nucleotides at a randomly
55 chosen autosomal SNP. Equivalently, the program sums the following scores
56 over all autosomal SNPs where both genotypes are defined, then divides by
57 the number of those SNPs:
58
59 0 if the individuals are homozygous for the same nucleotide
60
61 1 if the individuals are homozygous for different nucleotides
62
63 1/2 otherwise (i.e., if one or both individuals are heterozygous)
64
65 A SNP is ignored if one or both individuals have an undefined genotype
66 (designated as -1).
67 </help>
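<!--
The scoring rule in the help text can be sketched as follows. This is an
illustration, not the tool's implementation (the tool depends on the
gd_c_tools package). It assumes genotypes are coded 0 or 2 for the two
homozygous states and 1 for heterozygous; only the use of -1 for an
undefined genotype is stated above. The function names are hypothetical.

```python
def pair_score(g1, g2):
    """Score one SNP for one pair of individuals, or None if the SNP is ignored."""
    if g1 == -1 or g2 == -1:
        return None  # undefined genotype: the SNP is skipped
    if g1 == 1 or g2 == 1:
        return 0.5  # one or both individuals are heterozygous
    # Both homozygous: same nucleotide scores 0, different nucleotides score 1.
    return 0.0 if g1 == g2 else 1.0


def offspring_heterozygosity(genos1, genos2):
    """Average the per-SNP scores over SNPs where both genotypes are defined."""
    scores = [s for s in map(pair_score, genos1, genos2) if s is not None]
    if not scores:
        return None  # no usable SNPs for this pair
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Five SNPs; the fourth is skipped because one genotype is -1.
    print(offspring_heterozygosity([2, 2, 1, -1, 0], [2, 0, 0, 1, 0]))
```

In the example, the four usable SNPs score 0, 1, 1/2, and 0, so the
estimate is 1.5 / 4 = 0.375.
-->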
68 </tool>