comparison phylogenetic_tree.xml @ 18:f04f40a36cc8
Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
author | Richard Burhans <burhans@bx.psu.edu> |
---|---|
date | Tue, 23 Oct 2012 12:41:52 -0400 |
parents | 8ae67e9fb6ff |
children | 248b06e86022 |
17:a3af29edcce2 | 18:f04f40a36cc8 |
---|---|
29 | 29 |
30 <inputs> | 30 <inputs> |
31 <param name="input" type="data" format="gd_snp" label="SNP dataset" /> | 31 <param name="input" type="data" format="gd_snp" label="SNP dataset" /> |
32 | 32 |
33 <conditional name="individuals"> | 33 <conditional name="individuals"> |
34 <param name="choice" type="select" label="Individuals"> | 34 <param name="choice" type="select" label="Compute for"> |
35 <option value="0" selected="true">All individuals</option> | 35 <option value="0" selected="true">All individuals</option> |
36 <option value="1">Individuals in a population</option> | 36 <option value="1">Individuals in a population</option> |
37 </param> | 37 </param> |
38 <when value="0" /> | 38 <when value="0" /> |
39 <when value="1"> | 39 <when value="1"> |
40 <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" /> | 40 <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" /> |
41 </when> | 41 </when> |
42 </conditional> | 42 </conditional> |
43 | 43 |
44 <param name="minimum_coverage" type="integer" min="0" value="0" label="Minimum coverage" /> | 44 <param name="minimum_coverage" type="integer" min="0" value="0" label="Minimum SNP coverage" /> |
45 | 45 |
46 <param name="minimum_quality" type="integer" min="0" value="0" label="Minimum quality" help="Note: minimum coverage and minimum quality cannot both be 0" /> | 46 <param name="minimum_quality" type="integer" min="0" value="0" label="Minimum SNP quality" |
 | 47 help="Note: minimum coverage and minimum quality cannot both be 0" /> |
47 | 48 |
48 <param name="include_reference" type="select" format="integer" label="Include reference sequence"> | 49 <param name="include_reference" type="select" format="integer" label="Include reference sequence"> |
49 <option value="1" selected="true">Yes</option> | 50 <option value="1" selected="true">Yes</option> |
50 <option value="0">No</option> | 51 <option value="0">No</option> |
51 </param> | 52 </param> |
52 | 53 |
53 <param name="data_source" type="select" format="integer" label="Data source"> | 54 <param name="data_source" type="select" format="integer" label="Distance metric"> |
54 <option value="0" selected="true">sequence coverage</option> | 55 <option value="0" selected="true">sequence coverage</option> |
55 <option value="1">estimated genotype</option> | 56 <option value="1">estimated genotype</option> |
56 </param> | 57 </param> |
57 | 58 |
58 <param name="branch_style" type="select" display="radio"> | 59 <param name="branch_style" type="select" display="radio"> |
131 The informative SNPs can be used as a guide to how reliable the tree is. | 132 The informative SNPs can be used as a guide to how reliable the tree is. |
132 | 133 |
133 The input parameters are: | 134 The input parameters are: |
134 | 135 |
135 SNP dataset | 136 SNP dataset |
136 A table of SNPs for various individuals, in gd_snp format. | 137 A table of SNPs for various individuals, in gd_snp format. |
137 | 138 |
138 Individuals | 139 Individuals |
139 By default all individuals are included in the analysis, but this can | 140 By default all individuals are included in the analysis, but this can |
140 optionally be restricted to a subset that has been defined using the | 141 optionally be restricted to a subset that has been defined using the |
141 Specify Individuals tool. | 142 Specify Individuals tool. |
142 | 143 |
143 Minimum coverage | 144 Minimum SNP coverage |
144 For each pair of individuals, the tool looks for informative SNPs, i.e., | 145 For each pair of individuals, the tool looks for informative SNPs, i.e., |
145 where the sequence data for both individuals is adequate according to | 146 where the sequence data for both individuals is adequate. Specifying, |
146 some criterion. Specifying, say, 7 for this option instructs the tool | 147 say, 7 for this option instructs the tool to consider only SNPs with |
147 to consider only SNPs with coverage at least 7 in both individuals | 148 at least 7 reads in each of the two individuals (regardless of the |
148 when estimating their "genetic distance". | 149 alleles) when estimating their genetic distance. |
149 | 150 |
150 Minimum quality | 151 Minimum SNP quality |
151 Specifying, say, 37 for this option instructs the tool to consider | 152 Specifying, say, 37 for this option instructs the tool to consider |
152 only SNPs with SAMtools quality value at least 37 in both individuals | 153 only SNPs with a quality score of at least 37 in both individuals |
153 when estimating their "genetic distance". | 154 when estimating their genetic distance. |
154 | 155 |
155 Include reference sequence | 156 Include reference sequence |
156 For gd_snp datasets containing columns for a reference sequence, the | 157 For gd_snp datasets containing columns for a reference sequence, the |
157 user can ask that the reference be indicated in the tree, to help with | 158 user can ask that the reference be indicated in the tree, to help with |
158 rooting it. If the dataset has no reference columns, this option has | 159 rooting it. If the dataset has no reference columns, this option has |
159 no effect. | 160 no effect. |
160 | 161 |
161 Data source | 162 Distance metric |
162 The genetic distance between two individuals at a given SNP can | 163 The genetic distance between two individuals at a given SNP can |
163 be estimated two ways. One method is to use the absolute value of the | 164 be estimated two ways. One method is to use the absolute value of the |
164 difference in the frequency of the first allele (or equivalently, the | 165 difference in the frequency of the first allele (or equivalently, the |
165 second allele). For instance, if the first individual has 5 reads of | 166 second allele). For instance, if the first individual has 5 reads of |
166 each allele and the second individual has respectively 3 and 6 reads, | 167 each allele and the second individual has respectively 3 and 6 reads, |
167 then the frequencies are 1/2 and 1/3, giving a distance 1/6 at that | 168 then the frequencies are 1/2 and 1/3, giving a distance 1/6 at that |
168 SNP. The other approach is to use the SAMtools genotypes to estimate | 169 SNP. The other approach is to use the genotype calls to estimate |
169 the difference in the number of occurrences of the first allele. | 170 the difference in the number of occurrences of the first allele. |
170 For instance, if the two genotypes are 2 and 1, i.e., the individuals | 171 For instance, if the two genotypes are 2 and 1, i.e., the individuals |
171 are estimated to have respectively 2 and 1 occurrences of the first | 172 are estimated to have respectively 2 and 1 occurrences of the first |
172 allele at this location, then the distance is 1 (the absolute value | 173 allele at this location, then the distance is 1 (the absolute value |
173 of the difference of the two numbers). | 174 of the difference of the two numbers). |
174 | 175 |
175 Output options | 176 Output options |
176 The final four options apply mostly to the graphical drawing of the | 177 The final four options apply mostly to the graphical drawing of the |
177 tree, except that the branch lengths are also added to the Newick text | 178 tree, except that the branch lengths are also added to the Newick text |
178 file. | 179 file. |
179 | 180 |
180 ----- | 181 ----- |
181 | 182 |
182 **Acknowledgments** | 183 **Acknowledgments** |
183 | 184 |
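
The "Minimum SNP coverage" and "Distance metric" help text above describes a per-SNP filter and two per-SNP distance estimates. The following is a minimal Python sketch of that arithmetic, not the tool's actual implementation. The function names (`is_informative`, `coverage_distance`, `genotype_distance`) and the `(first_allele_reads, second_allele_reads)` tuple layout are assumptions made for illustration; they do not come from the changeset.

```python
from fractions import Fraction


def is_informative(reads_a, reads_b, minimum_coverage):
    """Return True if both individuals have at least `minimum_coverage`
    reads at this SNP, counting reads of either allele.

    `reads_a` and `reads_b` are hypothetical (first_allele, second_allele)
    read-count pairs, one per individual.
    """
    return sum(reads_a) >= minimum_coverage and sum(reads_b) >= minimum_coverage


def coverage_distance(reads_a, reads_b):
    """'sequence coverage' metric: absolute difference in the frequency
    of the first allele between the two individuals."""
    total_a, total_b = sum(reads_a), sum(reads_b)
    if total_a == 0 or total_b == 0:
        # Allele frequency is undefined without reads; a real tool would
        # presumably skip such SNPs via the coverage filter.
        raise ValueError("both individuals need at least one read")
    freq_a = Fraction(reads_a[0], total_a)
    freq_b = Fraction(reads_b[0], total_b)
    return abs(freq_a - freq_b)


def genotype_distance(genotype_a, genotype_b):
    """'estimated genotype' metric: absolute difference in the estimated
    number of occurrences of the first allele (0, 1, or 2)."""
    return abs(genotype_a - genotype_b)


# Worked example from the help text: 5 reads of each allele versus
# 3 and 6 reads gives frequencies 1/2 and 1/3, so the distance is 1/6.
print(coverage_distance((5, 5), (3, 6)))  # 1/6
# Genotypes 2 and 1 give a distance of 1.
print(genotype_distance(2, 1))  # 1
```

With a minimum coverage of 7, the pair `(5, 5)` and `(3, 6)` would count as informative under this sketch, because each individual has at least 7 reads in total regardless of allele.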