Which site(s) in an alignment evolve towards or away from a particular residue?
Screens protein sequence alignments in which the direction of evolution can be resolved (by rooting the tree, e.g. with an outgroup) for sites that evolve differently from a reference model. The reference is either a standard protein model selected by the user or a gene-average model (GTR); departures from it are taken as evidence of directional selection.
FADE (FUBAR Approach to Directional Evolution) is a fast method to test whether or not a subset of sites in a protein alignment evolve towards a particular residue along a subset of branches at accelerated rates compared to a reference model. FADE uses a random effects model and latent Dirichlet allocation (LDA)-inspired approximation methods to allocate sites to rate classes.
FADE is designed to detect directional selection, in which amino acid substitutions are consistently biased towards a particular residue type. Such a bias can indicate adaptation to new functional constraints or environments. By comparing the observed substitution patterns to a null model (e.g., a standard protein substitution model or a gene-average model), FADE identifies sites whose evolutionary trajectory shows a significant directional bias.
Note: the names of sequences in the alignment must match the names of the sequences in the tree.
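The name-matching requirement above can be checked before running the analysis. The following is a minimal sketch, not part of FADE itself: the helper names are hypothetical, it assumes a FASTA alignment in which the sequence name is the first whitespace-delimited word of each header, and it assumes a simple Newick string without quoted names.

```python
import re


def fasta_names(fasta_text):
    """Return the set of sequence names found in a FASTA string.

    Assumption: the name is the first whitespace-delimited word
    after '>' on each header line.
    """
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                names.add(header.split()[0])
    return names


def newick_leaf_names(newick):
    """Return the set of leaf names in a simple Newick string.

    Assumption: names are unquoted; {label} branch annotations
    are stripped before parsing.
    """
    cleaned = re.sub(r"\{[^}]*\}", "", newick)
    # A leaf name directly follows '(' or ',' and ends at ':', ',', ')' or ';'.
    # Internal node names follow ')' and are therefore not matched.
    found = re.findall(r"[(,]\s*([^(),:;\s][^(),:;]*)", cleaned)
    return {name.strip() for name in found}


def name_mismatches(fasta_text, newick):
    """Return (names only in the alignment, names only in the tree)."""
    aln = fasta_names(fasta_text)
    tree = newick_leaf_names(newick)
    return sorted(aln - tree), sorted(tree - aln)


if __name__ == "__main__":
    fasta = ">A\nMK\n>B\nMR\n>D\nMK\n"
    tree = "((A:0.1,B{FG}:0.2):0.3,C:0.4);"
    only_aln, only_tree = name_mismatches(fasta, tree)
    print("Only in alignment:", only_aln)
    print("Only in tree:", only_tree)
```

Both returned lists should be empty before the alignment and tree are used together.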
--model The baseline substitution model to use
[default] use GTR
--branches Which branches should be tested for selection?
All [default] : test all branches
Internal : test only internal branches (suitable for
intra-host pathogen evolution for example, where terminal branches
may contain polymorphism data)
Leaves : test only terminal (leaf) branches
Unlabeled : if the Newick string is labeled using the {} notation,
test only branches without explicit labels
(see http://hyphy.org/tutorials/phylotree/)
--grid The number of grid points
Smaller : faster
Larger : more precise posterior estimation but slower
default value: 20
--method Inference method to use
Variational-Bayes : 0-th order Variational Bayes approximation; fastest [default]
Metropolis-Hastings : Full Metropolis-Hastings MCMC algorithm; original method [slowest]
Collapsed-Gibbs : Collapsed Gibbs sampler [intermediate speed]
--chains How many MCMC chains to run (does not apply to Variational-Bayes)
default value: 5
--chain-length MCMC chain length (does not apply to Variational-Bayes)
default value: 2,000,000
--burn-in MCMC chain burn in (does not apply to Variational-Bayes)
default value: 1,000,000
--samples MCMC samples to draw (does not apply to Variational-Bayes)
default value: 1,000
--concentration_parameter
The concentration parameter of the Dirichlet prior
default value: 0.5
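As an illustration of how the options above combine, here is a hedged sketch that assembles a command line from them. The option names and defaults are those listed above; the `hyphy fade` invocation and the `--alignment` and `--tree` flags are assumptions not stated in this document, and `build_fade_command` is a hypothetical helper. The sketch only builds the argument list and does not run anything.

```python
METHODS = {"Variational-Bayes", "Metropolis-Hastings", "Collapsed-Gibbs"}


def build_fade_command(alignment, tree, model=None, branches="All", grid=20,
                       method="Variational-Bayes", chains=5,
                       chain_length=2_000_000, burn_in=1_000_000,
                       samples=1_000, concentration=0.5):
    """Assemble an argument list for a FADE run (illustrative only)."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")

    # Assumption: the analysis is started as 'hyphy fade' and takes its
    # inputs via '--alignment' and '--tree'; check your installation.
    cmd = ["hyphy", "fade", "--alignment", alignment, "--tree", tree]

    # Leave --model out to fall back on the documented default.
    if model is not None:
        cmd += ["--model", model]

    cmd += ["--branches", branches,
            "--grid", str(grid),
            "--method", method,
            "--concentration_parameter", str(concentration)]

    # The MCMC settings do not apply to Variational-Bayes.
    if method != "Variational-Bayes":
        if burn_in >= chain_length:
            raise ValueError("burn-in must be shorter than the chain length")
        cmd += ["--chains", str(chains),
                "--chain-length", str(chain_length),
                "--burn-in", str(burn_in),
                "--samples", str(samples)]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_fade_command("aln.fasta", "tree.nwk",
                                      branches="Internal",
                                      method="Collapsed-Gibbs")))
```

The MCMC options are added only for the two sampling methods, mirroring the "does not apply to Variational-Bayes" notes above. The burn-in check is a sanity test added for this sketch, not a documented rule of the tool.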
Output
------
A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf).
A custom visualization module for viewing these results is available (see http://vision.hyphy.org/FADE for an example).
A Markdown file with a summary of the analysis.