annotate add_fst_column.xml @ 18:f04f40a36cc8

Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
author Richard Burhans <burhans@bx.psu.edu>
date Tue, 23 Oct 2012 12:41:52 -0400
parents 8ae67e9fb6ff
children d6b961721037
rev   line source
18
f04f40a36cc8 Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
Richard Burhans <burhans@bx.psu.edu>
parents: 14
diff changeset
1 <tool id="gd_add_fst_column" name="Per-SNP FSTs" version="1.1.0">
14
8ae67e9fb6ff Uploaded Miller Lab Devshed version a51c894f5bed again [possible toolshed.g2 bug]
miller-lab
parents:
diff changeset
2 <description>: Compute a fixation index score for each SNP</description>
3
4 <command interpreter="python">
5 add_fst_column.py "$input" "$p1_input" "$p2_input" "$data_source" "$min_reads" "$min_qual" "$retain" "$discard_fixed" "$biased" "$output"
6 #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns)
7 #set $arg = '%s:%s' % ($individual_col, $individual)
8 "$arg"
9 #end for
10 </command>
11
12 <inputs>
13 <param name="input" type="data" format="gd_snp" label="SNP table" />
14 <param name="p1_input" type="data" format="gd_indivs" label="Population 1 individuals" />
15 <param name="p2_input" type="data" format="gd_indivs" label="Population 2 individuals" />
16
17 <param name="data_source" type="select" format="integer" label="Data source">
18 <option value="0" selected="true">sequence coverage</option>
19 <option value="1">estimated genotype</option>
20 </param>
21
22 <param name="min_reads" type="integer" min="0" value="0" label="Minimum total read count for a population" />
23 <param name="min_qual" type="integer" min="0" value="0" label="Minimum individual genotype quality" />
24
25 <param name="retain" type="select" label="Special treatment">
26 <option value="0" selected="true">Skip row</option>
27 <option value="1">Set FST = -1</option>
28 </param>
29
30 <param name="discard_fixed" type="select" label="Apparently fixed SNPs">
31 <option value="0">Retain SNPs that appear fixed in the two populations</option>
32 <option value="1" selected="true">Delete SNPs that appear fixed in the two populations</option>
33 </param>
34
35 <param name="biased" type="select" label="FST estimator">
36 <option value="0" selected="true">Wright's original definition</option>
37 <option value="1">The Weir-Cockerham estimator</option>
38 <option value="2">The Reich-Patterson estimator</option>
39 </param>
40
41 </inputs>
42
43 <outputs>
44 <data name="output" format="gd_snp" metadata_source="input" />
45 </outputs>
46
47 <tests>
48 <test>
49 <param name="input" value="test_in/sample.gd_snp" ftype="gd_snp" />
50 <param name="p1_input" value="test_in/a.gd_indivs" ftype="gd_indivs" />
51 <param name="p2_input" value="test_in/b.gd_indivs" ftype="gd_indivs" />
52 <param name="data_source" value="0" />
53 <param name="min_reads" value="3" />
54 <param name="min_qual" value="0" />
55 <param name="retain" value="0" />
56 <param name="discard_fixed" value="1" />
57 <param name="biased" value="0" />
58 <output name="output" file="test_out/add_fst_column/add_fst_column.gd_snp" />
59 </test>
60 </tests>
61
62 <help>
63
64 **What it does**
65
66 The user specifies a SNP table and two "populations" of individuals, both previously defined using the Galaxy tool to specify individuals from a SNP table. No individual can be in both populations. Other choices are as follows.
67
68 Data source. The allele frequencies of a SNP in the two populations can be estimated either from the total number of reads of each allele or by adding the frequencies inferred from the genotypes of individuals in the populations.
69
70 After specifying the data source, the user sets a lower bound on the amount of data required at a SNP. For estimating the Fst using read counts, the bound is the minimum total count of reads of the two alleles in a population. For estimations based on genotype, the bound is the minimum reported genotype quality per individual.
71
72 The user specifies whether the SNPs that violate the lower bound should be ignored or the Fst set to -1.
73
74 The user specifies whether SNPs where both populations appear to be fixed for the same allele should be retained or discarded.
75
76 Finally, the user chooses which definition of Fst to use: Wright's original definition, the Weir-Cockerham unbiased estimator, or the Reich-Patterson estimator.
77
78 A column is appended to the SNP table giving the Fst for each retained SNP.
79
80 References:
81
82 Sewall Wright (1951) The genetical structure of populations. Ann Eugen 15:323-354.
83
84 B. S. Weir and C. Clark Cockerham (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
85
86 B. S. Weir (1996) Population substructure. Genetic data analysis II, pp. 161-173. Sinauer Associates, Sunderland, MA.
87
88 David Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price, and Lalji Singh (2009) Reconstructing Indian population history. Nature 461:489-494, especially Supplement 2.
89
90 The effectiveness of these estimators for computing FSTs when there are many SNPs but few individuals is discussed in the following paper.
91
92 Eva-Maria Willing, Christine Dreyer, Cock van Oosterhout (2012) Estimates of genetic differentiation measured by FST do not necessarily require large sample sizes when using many SNP markers. PLoS One 7:e42649.
93
94 </help>
95 </tool>
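
The help text above describes the per-SNP computation only in words. The following Python sketch illustrates one way the "sequence coverage" data source could be combined with Wright's definition, the read-count bound, and the "Skip row" / "Set FST = -1" choice. It is not the code of add_fst_column.py, which is not shown here. The function names, the equal weighting of the two populations, and the value 0.0 reported for retained fixed SNPs are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# (reads of allele A, reads of allele B), summed over one population.
Counts = Tuple[int, int]


def wright_fst(p1: float, p2: float) -> Optional[float]:
    """Fst = (H_T - H_S) / H_T for two populations, weighted equally (assumption).

    Returns None when H_T is zero, i.e. both populations appear fixed
    for the same allele and the ratio is undefined.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return None
    # Mean expected heterozygosity within the two populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def per_snp_fst(pop1: Counts, pop2: Counts, min_reads: int = 0,
                retain: bool = False,
                discard_fixed: bool = True) -> Optional[float]:
    """Return the Fst for one SNP, -1.0 for a flagged SNP, or None to drop the row."""
    for a, b in (pop1, pop2):
        total = a + b
        # A population with no reads, or fewer than the bound, violates the lower bound.
        if total == 0 or total < min_reads:
            return -1.0 if retain else None
    p1 = pop1[0] / (pop1[0] + pop1[1])
    p2 = pop2[0] / (pop2[0] + pop2[1])
    fst = wright_fst(p1, p2)
    if fst is None:
        # Both populations fixed for the same allele. Reporting 0.0 for a
        # retained SNP is an assumption, not documented behavior of the tool.
        return None if discard_fixed else 0.0
    return fst
```

For example, `per_snp_fst((3, 1), (1, 3))` uses allele frequencies 0.75 and 0.25 and yields 0.25, while `per_snp_fst((1, 1), (5, 5), min_reads=3)` drops the row because the first population has only two reads.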