comparison add_fst_column.xml @ 18:f04f40a36cc8

Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
author Richard Burhans <burhans@bx.psu.edu>
date Tue, 23 Oct 2012 12:41:52 -0400
parents 8ae67e9fb6ff
children d6b961721037
17:a3af29edcce2 18:f04f40a36cc8
1 <tool id="gd_add_fst_column" name="Per-SNP FSTs" version="1.0.0"> 1 <tool id="gd_add_fst_column" name="Per-SNP FSTs" version="1.1.0">
2 <description>: Compute a fixation index score for each SNP</description> 2 <description>: Compute a fixation index score for each SNP</description>
3 3
4 <command interpreter="python"> 4 <command interpreter="python">
5 add_fst_column.py "$input" "$p1_input" "$p2_input" "$data_source" "$min_reads" "$min_qual" "$retain" "$discard_fixed" "$biased" "$output" 5 add_fst_column.py "$input" "$p1_input" "$p2_input" "$data_source" "$min_reads" "$min_qual" "$retain" "$discard_fixed" "$biased" "$output"
6 #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns) 6 #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns)
32 <option value="1" selected="true">Delete SNPs that appear fixed in the two populations</option> 32 <option value="1" selected="true">Delete SNPs that appear fixed in the two populations</option>
33 </param> 33 </param>
34 34
35 <param name="biased" type="select" label="FST estimator"> 35 <param name="biased" type="select" label="FST estimator">
36 <option value="0" selected="true">Wright's original definition</option> 36 <option value="0" selected="true">Wright's original definition</option>
37 <option value="1">Weir's unbiased estimator</option> 37 <option value="1">The Weir-Cockerham estimator</option>
38 <option value="2">The Reich-Patterson estimator</option>
38 </param> 39 </param>
39 40
40 </inputs> 41 </inputs>
41 42
42 <outputs> 43 <outputs>
60 61
61 <help> 62 <help>
62 63
63 **What it does** 64 **What it does**
64 65
65 The user specifies a SNP table and two "populations" of individuals, 66 The user specifies a SNP table and two "populations" of individuals, both previously defined using the Galaxy tool to specify individuals from a SNP table. No individual can be in both populations. Other choices are as follows.
66 both previously defined using the Specify Individuals tool.
67 No individual can be in both populations. Other choices are as follows.
68 67
69 Data source. The allele frequencies of a SNP in the two populations can be 68 Data source. The allele frequencies of a SNP in the two populations can be estimated either by the total number of reads of each allele, or by adding the frequencies inferred from genotypes of individuals in the populations.
70 estimated either by the total number of reads of each allele, or by adding
71 the frequencies inferred from genotypes of individuals in the populations.
72 69
73 After specifying the data source, the user sets lower bounds on amount 70 After specifying the data source, the user sets lower bounds on amount of data required at a SNP. For estimating the Fst using read counts, the bound is the minimum count of reads of the two alleles in a population. For estimations based on genotype, the bound is the minimum reported genotype quality per individual.
74 of data required at a SNP. For estimating the Fst using read counts,
75 the bound is the minimum count of reads of the two alleles in a population.
76 For estimations based on genotype, the bound is the minimum reported genotype
77 quality per individual.
78 71
79 The user specifies whether the SNPs that violate the lower bound should be 72 The user specifies whether the SNPs that violate the lower bound should be ignored or the Fst set to -1.
80 ignored or the Fst set to -1.
81 73
82 The user specifies whether SNPs where both populations appear to be fixed 74 The user specifies whether SNPs where both populations appear to be fixed for the same allele should be retained or discarded.
83 for the same allele should be retained or discarded.
84 75
85 Finally, the user chooses which definition of Fst to use: Wright's original 76 Finally, the user chooses which definition of Fst to use: Wright's original definition, the Weir-Cockerham unbiased estimator, or the Reich-Patterson estimator.
86 definition or Weir's unbiased estimator.
87 77
88 A column is appended to the SNP table giving the Fst for each retained SNP. 78 A column is appended to the SNP table giving the Fst for each retained SNP.
89 79
80 References:
81
82 Sewall Wright (1951) The genetical structure of populations. Ann Eugen 15:323-354.
83
84 B. S. Weir and C. Clark Cockerham (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
85
86 Weir, B.S. 1996. Population substructure. Genetic data analysis II, pp. 161-173. Sinauer Associates, Sunderland, MA.
87
88 David Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price, and Lalji Singh (2009) Reconstructing Indian population history. Nature 461:489-494, especially Supplement 2.
89
90 Their effectiveness for computing FSTs when there are many SNPs but few individuals is discussed in the following paper.
91
92 Eva-Maria Willing, Christine Dreyer, Cock van Oosterhout (2012) Estimates of genetic differentiation measured by FST do not necessarily require large sample sizes when using many SNP markers. PLoS One 7:e42649.
93
90 </help> 94 </help>
91 </tool> 95 </tool>
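
Note on the help text in this changeset: the read-count data source and the "Wright's original definition" option can be illustrated with a minimal sketch. It is not taken from add_fst_column.py, whose source is not shown in this comparison. The function names, the equal weighting of the two populations, and the use of None for an undefined value are all assumptions made for illustration. The formula used is the textbook heterozygosity form, Fst = (H_T - H_S) / H_T, for a biallelic SNP.

```python
from typing import Optional


def allele_frequency_from_reads(ref_reads: int, alt_reads: int,
                                min_reads: int) -> Optional[float]:
    """Estimate the frequency of one allele in a population from read counts.

    Hypothetical helper: returns None when the combined read count of the
    two alleles is below the lower bound (or zero), mirroring the
    "minimum count of reads" bound described in the help text.
    """
    total = ref_reads + alt_reads
    if total == 0 or total < min_reads:
        return None
    return ref_reads / total


def wright_fst(p1: float, p2: float) -> Optional[float]:
    """Textbook Wright Fst for one biallelic SNP and two populations.

    Assumes the two populations are weighted equally. Returns None when
    both populations are fixed for the same allele, because the total
    heterozygosity is then zero and the ratio is undefined.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return None
    # mean expected heterozygosity within the two populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical read counts for one SNP in two populations.
    p1 = allele_frequency_from_reads(ref_reads=9, alt_reads=3, min_reads=4)
    p2 = allele_frequency_from_reads(ref_reads=3, alt_reads=9, min_reads=4)
    if p1 is not None and p2 is not None:
        print(wright_fst(p1, p2))
```

In this sketch a SNP where both populations are fixed for the same allele has no defined Fst, which is one reason a tool might offer to discard such SNPs. The Weir-Cockerham and Reich-Patterson estimators add sample-size corrections that are not attempted here.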