FEL (Fixed Effects Likelihood) is a statistical method used to identify individual sites in a gene that are subject to pervasive diversifying selection. It addresses the question: Which specific sites in a gene show evidence of positive selection that has been consistently maintained across the entire evolutionary phylogeny of the analyzed sequences?
Pervasive selection is most prevalent in pathogen evolution and in any biological system shaped by evolutionary arms-race dynamics (or balancing selection), including adaptive immune escape by viruses. FEL is therefore well suited to identifying positively selected sites, which are candidates for strong selective pressure acting across the entire phylogeny.
FEL is our recommended method for analyzing small-to-medium size datasets when one wishes only to study pervasive selection at individual sites.
FEL (Fixed Effects Likelihood) detects pervasive positive or negative selection at individual sites in a coding sequence. It estimates site-wise synonymous (alpha, dS) and non-synonymous (beta, dN) substitution rates by maximum likelihood. For each site, FEL then performs a likelihood ratio test (LRT) comparing a null model (dN = dS) against an alternative model (dN != dS). A p-value below the chosen significance threshold indicates that the site is under selection: diversifying if dN > dS, purifying if dN < dS. Because the method aggregates information across all branches of the phylogenetic tree, it is suited to identifying sites under pervasive diversifying or pervasive purifying selection. Although primarily designed for pervasive selection, FEL can also test a subset of branches; in that case it fits an additional nuisance parameter for the non-synonymous rate on the branches not selected for testing.
Intuition: Imagine you're looking at a gene's evolution across different species. Some parts of the gene might change a lot (diversifying selection), while others stay the same (purifying selection). FEL helps pinpoint the exact "letters" (sites) in the gene that are consistently under pressure to change or stay the same throughout its evolutionary history. It does this by comparing how often synonymous (silent) changes happen versus non-synonymous (amino acid altering) changes at each site. If non-synonymous changes happen significantly more often, it suggests positive selection; if they happen significantly less often, it suggests purifying selection.
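The per-site test described above can be sketched in a few lines of Python. This is an illustration only, not HyPhy's implementation: it assumes the LRT statistic is compared against a chi-square distribution with one degree of freedom (the usual asymptotic choice when the alternative model has one extra free parameter), and the function names `lrt_pvalue` and `classify_site` are hypothetical.

```python
import math


def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test with 1 degree of freedom (assumed)."""
    # LRT statistic: twice the log-likelihood gain of the alternative model.
    # Clamp at zero to absorb tiny negative values caused by optimizer noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of a chi-square variable with 1 df:
    # P(X >= stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def classify_site(alpha, beta, lnl_null, lnl_alt, threshold=0.1):
    """Label one site from its rate estimates and the two model log-likelihoods.

    alpha is the synonymous rate (dS), beta the non-synonymous rate (dN).
    threshold plays the role of the --pvalue option (default 0.1).
    """
    p = lrt_pvalue(lnl_null, lnl_alt)
    if p > threshold:
        return "not significant", p
    # Significant: the direction of the rate difference names the selection type.
    return ("diversifying" if beta > alpha else "purifying"), p
```

For example, `classify_site(0.5, 2.0, -100.0, -97.0)` yields a "diversifying" label, because the alternative model improves the log-likelihood by 3 units (statistic 6) and beta exceeds alpha.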
Note: the names of sequences in the alignment must match the names of the sequences in the tree.
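A quick pre-flight check for this requirement might look like the sketch below. It assumes a FASTA alignment and a simple unquoted Newick tree (HyPhy accepts other formats as well); the helper names are hypothetical, and quoted Newick names are not handled.

```python
import re


def fasta_names(fasta_text: str) -> set[str]:
    """Sequence names from FASTA headers (first whitespace-delimited token)."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def newick_leaf_names(newick: str) -> set[str]:
    """Leaf names from a simple Newick string, ignoring {label} annotations."""
    # Drop {...} branch labels so they are not mistaken for part of a name.
    stripped = re.sub(r"\{[^}]*\}", "", newick)
    # A leaf name directly follows "(" or "," ; internal node names follow ")".
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", stripped))


def name_mismatches(fasta_text: str, newick: str):
    """Return (names only in the alignment, names only in the tree)."""
    seqs, leaves = fasta_names(fasta_text), newick_leaf_names(newick)
    return sorted(seqs - leaves), sorted(leaves - seqs)
```

Both returned lists being empty means every sequence has a matching leaf and vice versa.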
A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf). A custom visualization module for viewing these results is available (see http://vision.hyphy.org/FEL for an example).
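The results file can also be post-processed directly. The sketch below assumes a layout in which `MLE.headers` is a list of `[name, description]` pairs and `MLE.content` maps a partition index (as a string) to one row of numbers per site, with columns named `alpha`, `beta`, and `p-value`. Check these names against the JSON fields document linked above before relying on them. The function name and the toy values are made up for illustration.

```python
def significant_sites(results: dict, threshold: float = 0.1, partition: str = "0"):
    """List (site, kind, p-value) for sites at or below the p-value threshold."""
    # Look columns up by header name rather than by fixed position.
    headers = [h[0] for h in results["MLE"]["headers"]]
    i_alpha, i_beta, i_p = (headers.index(k) for k in ("alpha", "beta", "p-value"))
    hits = []
    # Sites are reported with 1-based indices, matching --limit-to-sites.
    for site, row in enumerate(results["MLE"]["content"][partition], start=1):
        if row[i_p] <= threshold:
            kind = "diversifying" if row[i_beta] > row[i_alpha] else "purifying"
            hits.append((site, kind, row[i_p]))
    return hits


# Toy data in the assumed layout (values invented, not real FEL output).
toy = {
    "MLE": {
        "headers": [["alpha", "dS"], ["beta", "dN"], ["p-value", "LRT p-value"]],
        "content": {"0": [[0.5, 2.0, 0.014], [1.0, 1.1, 0.65], [2.0, 0.1, 0.006]]},
    }
}
```

With a real file, `results` would come from `json.load` on the output path, and `significant_sites(results)` would return the sites passing the default threshold of 0.1.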
--alignment [required] An in-frame codon alignment in one of the formats supported by HyPhy.
--tree [conditionally required] A phylogenetic tree (optionally annotated with {} branch labels).
--code Which genetic code to use (see tool form for available options).
--multiple-hits Include support for multiple nucleotide substitutions.
Double : Include branch-specific rates for double nucleotide substitutions.
Double+Triple : Include branch-specific rates for double and triple nucleotide substitutions.
None [default] : Use standard models which permit only single nucleotide changes to occur instantly.
--site-multihit Estimate multiple hit rates for each site. This option is available only if 'Include support for multiple nucleotide substitutions' is set to 'Double' or 'Double+Triple'.
Estimate [default] : Estimate multiple hit rates.
No : Do not estimate multiple hit rates.
--branches Which branches should be tested for selection?
All [default] : test all branches.
Internal : test only internal branches (suitable for intra-host pathogen evolution for example, where terminal branches may contain polymorphism data).
Leaves : test only terminal (leaf) branches.
Unlabeled : if the Newick string is labeled using the {} notation, test only branches without explicit labels (see http://hyphy.org/tutorials/phylotree/).
Custom : Enter a branch label.
--pvalue The p-value threshold below which a site is reported as significant (default: 0.1, range: 0 to 1).
--srv Include site-to-site synonymous rate variation?
Yes [default] : Allow synonymous rates to vary from site to site.
No : Do not allow synonymous rates to vary.
--ci Compute profile likelihood confidence intervals for each variable site (default: No).
--resample Perform parametric bootstrap resampling to derive site-level null LRT distributions. The value is the maximum number of replicates per site (default: 0, range: 0 to 1000); 0 means no resampling is performed.
Warning: Resampling makes the analysis significantly slower.
--restrict-sites Restrict FEL analysis to a subset of sites.
Yes : Restrict analysis to a subset of sites.
No [default] : Do not restrict analysis to a subset of sites.
--limit-to-sites Only analyze sites whose 1-based indices match the following list (null to skip). This option is available only if 'Restrict FEL analysis to a subset of sites' is set to 'Yes'. Comma-separated list of site indices.
--save-lf-for-sites For sites whose 1-based indices match the following list, write out likelihood function snapshots (empty string to skip). This option is available only if 'Restrict FEL analysis to a subset of sites' is set to 'Yes'. Comma-separated list of site indices.
--precision Optimization precision settings for preliminary fits.
Standard [default]
Reduced : Reduced precision, for faster fitting.
--kill-zero-lengths How to handle internal zero-length branches.
Yes [default] : Automatically delete internal zero-length branches for computational efficiency (will not affect results otherwise).
Constrain : Keep zero-length branches, but constrain their values to 0.
No : Keep all branches.
--full-model Re-optimize branch lengths under the full codon model (default: Yes).
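Several of the options above depend on one another or have restricted ranges. The following sketch assembles an argument list while enforcing those documented constraints. It assumes the analysis is launched as `hyphy fel` from a command line, which may differ in your installation; the function name and the file names in the usage note are hypothetical.

```python
def build_fel_args(alignment, tree=None, multiple_hits="None", site_multihit=None,
                   pvalue=0.1, resample=0, limit_to_sites=None):
    """Build an argument list for a FEL run from a subset of the options."""
    if multiple_hits not in ("None", "Double", "Double+Triple"):
        raise ValueError("multiple_hits must be None, Double, or Double+Triple")
    # --site-multihit is only available when multiple hits are modeled.
    if site_multihit is not None and multiple_hits == "None":
        raise ValueError("site_multihit requires Double or Double+Triple")
    if not 0 <= pvalue <= 1:
        raise ValueError("pvalue must be between 0 and 1")
    if not 0 <= resample <= 1000:
        raise ValueError("resample must be between 0 and 1000")

    # "hyphy fel" as the launcher is an assumption, not taken from this page.
    args = ["hyphy", "fel", "--alignment", alignment]
    if tree:
        args += ["--tree", tree]
    args += ["--multiple-hits", multiple_hits]
    if site_multihit is not None:
        args += ["--site-multihit", site_multihit]
    args += ["--pvalue", str(pvalue), "--resample", str(resample)]
    if limit_to_sites:
        # Site indices are 1-based and passed as a comma-separated list.
        if any(s < 1 for s in limit_to_sites):
            raise ValueError("site indices are 1-based")
        args += ["--restrict-sites", "Yes",
                 "--limit-to-sites", ",".join(str(s) for s in limit_to_sites)]
    return args
```

A call such as `build_fel_args("aln.fna", "tree.nwk", limit_to_sites=[3, 17])` returns a list that could be handed to `subprocess.run`; building the list separately keeps the validation testable without running an analysis.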