HyPhy-BGM (version 2.5.47+galaxy0)
If the input file type is NEXUS and it includes a valid Newick tree, that tree overrides any uploaded Newick tree.

BGM : Bayesian Graphical Models

What does this do?

This tool identifies groups of sites in the alignment that experience substitutions along the same branches, i.e. sites that co-evolve.

Brief description

BGM (Bayesian Graphical Model) uses a maximum likelihood ancestral state reconstruction to map substitution events (non-synonymous only for coding data) to branches in the phylogeny, and then analyzes the joint distribution of the substitution map using a Bayesian graphical model (network). Next, a Markov chain Monte Carlo analysis generates a random sample of network structures from the posterior distribution given the data. Each node in the network represents a site in the alignment, and links (edges) between nodes indicate high posterior support for correlated substitutions at the two sites over time, which implies coevolution.
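
To make the idea of a substitution map concrete, the sketch below counts, for each pair of sites, the branches on which both sites changed. It is only an illustration of the kind of data the graphical model analyzes, not the BGM inference itself; the site and branch labels and the function name `shared_branch_counts` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: for each alignment site, the set of branches on
# which a substitution was inferred. All labels here are made up.
substitution_map = {
    "site_12": {"Node1", "Node3", "taxonA"},
    "site_45": {"Node1", "Node3", "taxonB"},
    "site_78": {"taxonC"},
}


def shared_branch_counts(sub_map):
    """Count, for every pair of sites, the branches where both sites changed."""
    counts = {}
    for site_a, site_b in combinations(sorted(sub_map), 2):
        counts[(site_a, site_b)] = len(sub_map[site_a] & sub_map[site_b])
    return counts


pair_counts = shared_branch_counts(substitution_map)
for pair, count in pair_counts.items():
    print(pair, count)
```

Pairs that repeatedly change on the same branches are the ones a network model would be expected to link; the actual analysis weighs this evidence statistically rather than by raw counts.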

Input

  1. A FASTA sequence alignment.
  2. A phylogenetic tree in Newick format.

Note: the names of sequences in the alignment must match the names of the sequences in the tree.
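
A quick pre-flight check for the name-matching requirement might look like the sketch below. It assumes simple inputs: FASTA names are taken as the first word of each header line, and Newick leaf names are unquoted and contain no spaces. The helper names are hypothetical and are not part of HyPhy or Galaxy.

```python
import re


def fasta_names(fasta_text):
    """Return the set of sequence names (first word after '>') in a FASTA string."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                names.add(header.split()[0])
    return names


def newick_leaf_names(newick_text):
    """Return leaf names from a simple Newick string.

    Assumption: names are unquoted; a leaf name is a label that directly
    follows '(' or ','. Internal node labels follow ')' and are skipped.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;{}\[\]]+)", newick_text))


def compare_names(fasta_text, newick_text):
    """Return (names only in the alignment, names only in the tree)."""
    in_alignment = fasta_names(fasta_text)
    in_tree = newick_leaf_names(newick_text)
    return in_alignment - in_tree, in_tree - in_alignment


example_fasta = ">A first sequence\nACGT\n>B\nACGT\n>D\nACGT\n"
example_tree = "((A:0.1,B:0.2)Node1:0.05,C:0.3);"
only_in_alignment, only_in_tree = compare_names(example_fasta, example_tree)
print("Only in alignment:", sorted(only_in_alignment))
print("Only in tree:", sorted(only_in_tree))
```

Both sets should be empty before the files are submitted; quoted names or names containing special characters would need a proper Newick parser.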

Output

A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf).

A custom visualization module for viewing these results is available (see http://vision.hyphy.org/BGM for an example).
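
The result file can also be inspected with any JSON library. The sketch below only lists the top-level keys and the type of each value, because the actual field names are defined in the JSON fields document linked above and are not reproduced here. The keys in the stand-in text are placeholders, not real BGM field names, and `summarize_json` is a hypothetical helper.

```python
import json


def summarize_json(json_text):
    """Map each top-level key of a JSON object to the type name of its value."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}


# Stand-in content with placeholder keys; a real run would read the
# downloaded result file instead, e.g. open(path).read().
stand_in = '{"placeholder_settings": {"x": 1}, "placeholder_rows": [[1, 2, 0.9]]}'
summary = summarize_json(stand_in)
for key, kind in summary.items():
    print(f"{key}: {kind}")
```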

Tool options

--branches          Which branches should be tested for selection?
                        All [default]: test all branches

                        Internal: test only internal branches (suitable, for example,
                        for intra-host pathogen evolution, where terminal branches
                        may contain polymorphism data)

                        Leaves: test only terminal (leaf) branches

                        Unlabeled: if the Newick string is labeled using the {} notation,
                        test only branches without explicit labels
                        (see http://hyphy.org/tutorials/phylotree/)

--max-parents      The maximum number of parents allowed per node, i.e. how many sites
                   can directly influence substitution patterns at another site.
                   Increasing this number increases computational complexity nonlinearly.
                        default value: 1

--min-subs         The minimum number of substitutions a site must have to be included in the analysis.
                   Filters out low-complexity sites (those with too few substitutions).
                        default value: 1

--chains           How many MCMC chains to run (does not apply to Variational-Bayes)
                        default value: 5

--steps            MCMC chain length (does not apply to Variational-Bayes)
                        default value: 100,000

--burn-in          MCMC chain burn-in length (does not apply to Variational-Bayes)
                        default value: 10,000

--samples          MCMC samples to draw (does not apply to Variational-Bayes)
                        default value: 100
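
For scripted runs outside Galaxy, the options above could be assembled into an argument list as sketched below. This is a hedged example, not the Galaxy wrapper's actual command: the `hyphy bgm` invocation and the `--alignment` and `--tree` flags are assumptions not stated in this help text, and the two sanity checks (burn-in shorter than the chain, samples not exceeding the post-burn-in steps) are common-sense assumptions rather than documented rules.

```python
def build_bgm_args(alignment, tree, branches="All", max_parents=1, min_subs=1,
                   chains=5, steps=100_000, burn_in=10_000, samples=100):
    """Build an argument list using the option names and defaults listed above."""
    # Assumed sanity checks; not taken from the tool documentation.
    if burn_in >= steps:
        raise ValueError("burn-in must be shorter than the chain length")
    if samples > steps - burn_in:
        raise ValueError("more samples requested than post-burn-in steps")
    return [
        "hyphy", "bgm",            # assumed executable and analysis name
        "--alignment", alignment,  # assumed flag
        "--tree", tree,            # assumed flag
        "--branches", branches,
        "--max-parents", str(max_parents),
        "--min-subs", str(min_subs),
        "--chains", str(chains),
        "--steps", str(steps),
        "--burn-in", str(burn_in),
        "--samples", str(samples),
    ]


args = build_bgm_args("alignment.fasta", "tree.nwk", max_parents=2)
print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)` once the flags have been confirmed against the installed HyPhy version.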