
SnpEff download: (version 4.3+T.galaxy2)
The list of available databases can be obtained with the 'SnpEff databases' tool.

What it does

This tool downloads a specified database from https://sourceforge.net/projects/snpeff/files/databases/v4_3/ and deposits it into your history.


The usage scenario

Suppose you want to annotate a VCF file containing variants within the mm10 version of the mouse genome. To do this, you can:

  1. Download the mm10 snpEff database by typing mm10 into the Select the annotation database... text box.
  2. Run SnpEff eff, choosing the downloaded database from the history via the Downloaded snpEff database in your history option of the Genome source parameter.

Using SnpEff in Galaxy: A few points to remember

SnpEff relies on specially formatted databases to generate annotations. It will not work without them. There are several ways to obtain these databases.

Pre-cached databases

Many standard databases (e.g., human, mouse, Drosophila) are likely pre-cached within a given Galaxy instance. You should be able to see them listed in the Genome drop-down of the SnpEff eff tool.

If you do not see them, keep reading...

Download pre-built databases

The SnpEff project generates a large number of pre-built databases. These are available at https://sourceforge.net/projects/snpeff/files/databases/v4_3/ and can be downloaded. Follow these steps:

  1. Use the SnpEff databases tool to generate a list of existing databases. Note the name of the database you need.
  2. Use the SnpEff download tool to download the database.
  3. Finally, run SnpEff eff, choosing the downloaded database from the history via the Downloaded snpEff database in your history option of the Genome source parameter.

Alternatively, you can specify the name of the database directly in SnpEff eff using the Download on demand option (again, under the Genome source parameter). In this case, snpEff will download the database before performing the annotation.
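
For readers who also run snpEff outside Galaxy, the three-step workflow above corresponds roughly to three snpEff command-line calls. The sketch below only assembles those command lines and does not execute them. The jar location (snpEff.jar), the helper name snpeff_commands, and the file name variants.vcf are assumptions made for illustration; how a given Galaxy instance invokes snpEff internally may differ.

```python
def snpeff_commands(database, vcf_path, jar="snpEff.jar"):
    """Return the argument lists for the three workflow steps.

    The jar path is an assumption; adjust it to your own installation.
    """
    base = ["java", "-jar", jar]
    return {
        # Step 1: list the databases known to snpEff
        "databases": base + ["databases"],
        # Step 2: download the named pre-built database
        "download": base + ["download", "-v", database],
        # Step 3: annotate the VCF (snpEff writes the result to standard output)
        "annotate": base + ["eff", database, vcf_path],
    }


if __name__ == "__main__":
    # Print the command lines for the mm10 example used in this help text
    for step, argv in snpeff_commands("mm10", "variants.vcf").items():
        print(step + ":", " ".join(argv))
```

The commands are returned as lists rather than strings so that, if you choose to run them, they can be passed to subprocess.run without shell quoting problems.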

Create your own database

When you are dealing with bacterial or viral (or, frankly, any other) genomes, it may be easier to create the database yourself. To do this:

  1. Download the GenBank record corresponding to your genome of interest from NCBI, or use annotations in GFF format accompanied by the corresponding genome in FASTA format.
  2. Use SnpEff build to create the database.
  3. Use the database in SnpEff eff (using the Custom option for the Genome source parameter).

Creating a custom database has one major advantage: it guarantees that you will not have any issues related to reference sequence naming, the most common source of SnpEff errors.
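
To make the naming issue concrete, here is a minimal sketch of a pre-flight check you could run on your own files before annotation. It compares the chromosome names in the first column of a VCF with the sequence names in a FASTA file and reports any that do not match. The function names and the toy data are hypothetical, and the check assumes the GFF plus FASTA route; for a database built from a GenBank record, the sequence name comes from the record itself.

```python
def fasta_names(lines):
    """Collect sequence names: the first whitespace-delimited token after '>'."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def vcf_chroms(lines):
    """Collect the CHROM values (first tab-separated column) of VCF records."""
    chroms = set()
    for line in lines:
        # Skip header lines (starting with '#') and blank lines
        if line.startswith("#") or not line.strip():
            continue
        chroms.add(line.split("\t", 1)[0])
    return chroms


def unmatched_chroms(vcf_lines, fasta_lines):
    """Return VCF chromosome names that are absent from the FASTA, sorted."""
    return sorted(vcf_chroms(vcf_lines) - fasta_names(fasta_lines))


if __name__ == "__main__":
    # Toy data: the VCF calls the mitochondrion 'MT', the FASTA calls it 'chrM'
    fasta = [">chr1 Mus musculus", "ACGT", ">chrM", "ACGT"]
    vcf = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT",
        "chr1\t10\t.\tA\tG",
        "MT\t5\t.\tC\tT",
    ]
    print(unmatched_chroms(vcf, fasta))  # prints ['MT']
```

An empty result means every chromosome name used in the VCF also appears in the reference; any name that is listed would need to be reconciled before running SnpEff eff.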


To learn more about snpEff, read its manual at http://snpeff.sourceforge.net/SnpEff_manual.html