comparison phylogenetic_tree.xml @ 18:f04f40a36cc8

Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
author Richard Burhans <burhans@bx.psu.edu>
date Tue, 23 Oct 2012 12:41:52 -0400
parents 8ae67e9fb6ff
children 248b06e86022
17:a3af29edcce2 18:f04f40a36cc8
29 29
30 <inputs> 30 <inputs>
31 <param name="input" type="data" format="gd_snp" label="SNP dataset" /> 31 <param name="input" type="data" format="gd_snp" label="SNP dataset" />
32 32
33 <conditional name="individuals"> 33 <conditional name="individuals">
34 <param name="choice" type="select" label="Individuals"> 34 <param name="choice" type="select" label="Compute for">
35 <option value="0" selected="true">All individuals</option> 35 <option value="0" selected="true">All individuals</option>
36 <option value="1">Individuals in a population</option> 36 <option value="1">Individuals in a population</option>
37 </param> 37 </param>
38 <when value="0" /> 38 <when value="0" />
39 <when value="1"> 39 <when value="1">
40 <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" /> 40 <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" />
41 </when> 41 </when>
42 </conditional> 42 </conditional>
43 43
44 <param name="minimum_coverage" type="integer" min="0" value="0" label="Minimum coverage" /> 44 <param name="minimum_coverage" type="integer" min="0" value="0" label="Minimum SNP coverage" />
45 45
46 <param name="minimum_quality" type="integer" min="0" value="0" label="Minimum quality" help="Note: minimum coverage and minimum quality cannot both be 0" /> 46 <param name="minimum_quality" type="integer" min="0" value="0" label="Minimum SNP quality"
47 help="Note: minimum coverage and minimum quality cannot both be 0" />
47 48
48 <param name="include_reference" type="select" format="integer" label="Include reference sequence"> 49 <param name="include_reference" type="select" format="integer" label="Include reference sequence">
49 <option value="1" selected="true">Yes</option> 50 <option value="1" selected="true">Yes</option>
50 <option value="0">No</option> 51 <option value="0">No</option>
51 </param> 52 </param>
52 53
53 <param name="data_source" type="select" format="integer" label="Data source"> 54 <param name="data_source" type="select" format="integer" label="Distance metric">
54 <option value="0" selected="true">sequence coverage</option> 55 <option value="0" selected="true">sequence coverage</option>
55 <option value="1">estimated genotype</option> 56 <option value="1">estimated genotype</option>
56 </param> 57 </param>
57 58
58 <param name="branch_style" type="select" display="radio"> 59 <param name="branch_style" type="select" display="radio">
131 The informative SNPs can be used as a guide to how reliable the tree is. 132 The informative SNPs can be used as a guide to how reliable the tree is.
132 133
133 The input parameters are: 134 The input parameters are:
134 135
135 SNP dataset 136 SNP dataset
136 A table of SNPs for various individuals, in gd_snp format. 137 A table of SNPs for various individuals, in gd_snp format.
137 138
138 Individuals 139 Individuals
139 By default all individuals are included in the analysis, but this can 140 By default all individuals are included in the analysis, but this can
140 optionally be restricted to a subset that has been defined using the 141 optionally be restricted to a subset that has been defined using the
141 Specify Individuals tool. 142 Specify Individuals tool.
142 143
143 Minimum coverage 144 Minimum SNP coverage
144 For each pair of individuals, the tool looks for informative SNPs, i.e., 145 For each pair of individuals, the tool looks for informative SNPs, i.e.,
145 where the sequence data for both individuals is adequate according to 146 where the sequence data for both individuals is adequate. Specifying,
146 some criterion. Specifying, say, 7 for this option instructs the tool 147 say, 7 for this option instructs the tool to consider only SNPs with
147 to consider only SNPs with coverage at least 7 in both individuals 148 at least 7 reads in each of the two individuals (regardless of the
148 when estimating their "genetic distance". 149 alleles) when estimating their genetic distance.
149 150
150 Minimum quality 151 Minimum SNP quality
151 Specifying, say, 37 for this option instructs the tool to consider 152 Specifying, say, 37 for this option instructs the tool to consider
152 only SNPs with SAMtools quality value at least 37 in both individuals 153 only SNPs with a quality score of at least 37 in both individuals
153 when estimating their "genetic distance". 154 when estimating their genetic distance.
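
As an illustration of the two thresholds described above, the following is a minimal sketch and not the tool's actual code. The `Call` record and the `is_informative` function are hypothetical names introduced here, and the record is not the gd_snp column layout. The sketch assumes that coverage means the total read count regardless of allele, that both thresholds must hold for both individuals, and that setting both thresholds to 0 is rejected with an error (the parameter help only says they cannot both be 0).

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical per-individual record at one SNP (not the gd_snp layout)."""
    reads_a: int   # reads supporting the first allele
    reads_b: int   # reads supporting the second allele
    quality: int   # quality score of this individual's call


def is_informative(x: Call, y: Call,
                   minimum_coverage: int = 0,
                   minimum_quality: int = 0) -> bool:
    """Return True if the SNP passes both thresholds in both individuals."""
    # Assumption: the tool refuses to run when both thresholds are 0.
    if minimum_coverage == 0 and minimum_quality == 0:
        raise ValueError("minimum coverage and minimum quality cannot both be 0")
    for call in (x, y):
        # Coverage counts all reads, regardless of which allele they support.
        if call.reads_a + call.reads_b < minimum_coverage:
            return False
        if call.quality < minimum_quality:
            return False
    return True
```

With thresholds 7 and 37, a pair with read counts (4, 3) and (7, 0) and qualities 40 and 37 would pass, while lowering either individual to 6 reads or to quality 36 would exclude the SNP for that pair.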
154 155
155 Include reference sequence 156 Include reference sequence
156 For gd_snp datasets containing columns for a reference sequence, the 157 For gd_snp datasets containing columns for a reference sequence, the
157 user can ask that the reference be indicated in the tree, to help with 158 user can ask that the reference be indicated in the tree, to help with
158 rooting it. If the dataset has no reference columns, this option has 159 rooting it. If the dataset has no reference columns, this option has
159 no effect. 160 no effect.
160 161
161 Data source 162 Distance metric
162 The genetic distance between two individuals at a given SNP can 163 The genetic distance between two individuals at a given SNP can
163 be estimated two ways. One method is to use the absolute value of the 164 be estimated two ways. One method is to use the absolute value of the
164 difference in the frequency of the first allele (or equivalently, the 165 difference in the frequency of the first allele (or equivalently, the
165 second allele). For instance, if the first individual has 5 reads of 166 second allele). For instance, if the first individual has 5 reads of
166 each allele and the second individual has respectively 3 and 6 reads, 167 each allele and the second individual has respectively 3 and 6 reads,
167 then the frequencies are 1/2 and 1/3, giving a distance 1/6 at that 168 then the frequencies are 1/2 and 1/3, giving a distance 1/6 at that
168 SNP. The other approach is to use the SAMtools genotypes to estimate 169 SNP. The other approach is to use the genotype calls to estimate
169 the difference in the number of occurrences of the first allele. 170 the difference in the number of occurrences of the first allele.
170 For instance, if the two genotypes are 2 and 1, i.e., the individuals 171 For instance, if the two genotypes are 2 and 1, i.e., the individuals
171 are estimated to have respectively 2 and 1 occurrences of the first 172 are estimated to have respectively 2 and 1 occurrences of the first
172 allele at this location, then the distance is 1 (the absolute value 173 allele at this location, then the distance is 1 (the absolute value
173 of the difference of the two numbers). 174 of the difference of the two numbers).
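
The two per-SNP distances described above can be checked with a short worked example. This is only a sketch of the arithmetic in the help text; the function names `coverage_distance` and `genotype_distance` are hypothetical and do not come from the tool. It assumes each individual has at least one read at the SNP and that a genotype is the estimated number of copies of the first allele.

```python
from fractions import Fraction


def coverage_distance(reads_1, reads_2):
    """Distance from read counts: |freq of first allele in 1 - freq in 2|.

    Each argument is a (first_allele_reads, second_allele_reads) pair.
    Assumes each individual has at least one read at this SNP.
    """
    freq_1 = Fraction(reads_1[0], sum(reads_1))
    freq_2 = Fraction(reads_2[0], sum(reads_2))
    return abs(freq_1 - freq_2)


def genotype_distance(genotype_1, genotype_2):
    """Distance from genotype calls: |difference in first-allele copies|."""
    return abs(genotype_1 - genotype_2)


# Example from the help text: 5 and 5 reads versus 3 and 6 reads.
print(coverage_distance((5, 5), (3, 6)))   # 1/6
# Example from the help text: genotypes 2 and 1.
print(genotype_distance(2, 1))             # 1
```

Using the second allele instead gives frequencies 1/2 and 2/3, so the distance is again 1/6, which is why the help text calls the two choices equivalent.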
174 175
175 Output options 176 Output options
176 The final four options apply mostly to the graphical drawing of the 177 The final four options apply mostly to the graphical drawing of the
177 tree, except that the branch lengths are also added to the Newick text 178 tree, except that the branch lengths are also added to the Newick text
178 file. 179 file.
179 180
180 ----- 181 -----
181 182
182 **Acknowledgments** 183 **Acknowledgments**
183 184