Galaxy | Tool Preview

eggNOG Mapper (version 2.1.8+galaxy4)

eggnog-mapper

Overview

eggnog-mapper is a tool for fast functional annotation of novel sequences (genes or proteins) using precomputed eggNOG-based orthology assignments. Typical applications include the annotation of novel genomes, transcriptomes or even metagenomic gene catalogs. The use of orthology predictions for functional annotation is considered more precise than traditional homology searches, as it avoids transferring annotations from paralogs (duplicate genes with a higher chance of being involved in functional divergence).

EggNOG-mapper is also available as a public online resource: http://beta-eggnogdb.embl.de/#/app/emapper.

Outputs

annotations

This file provides final annotations of each query. Tab-delimited columns in the file are:

  • query_name: query sequence name
  • seed_eggNOG_ortholog: best protein match in eggNOG
  • seed_ortholog_evalue: best protein match (e-value)
  • seed_ortholog_score: best protein match (bit-score)
  • predicted_taxonomic_group
  • predicted_protein_name: Predicted protein name for query sequences
  • GO_terms: Comma delimited list of predicted Gene Ontology terms
  • EC_number
  • KEGG_KO
  • KEGG_Pathway: Comma delimited list of predicted KEGG pathways
  • KEGG_Module
  • KEGG_Reaction
  • KEGG_rclass
  • BRITE
  • KEGG_TC
  • CAZy
  • BiGG_Reactions
  • Annotation_tax_scope: The taxonomic scope used to annotate this query sequence
  • Matching_OGs: Comma delimited list of matching eggNOG Orthologous Groups
  • best_OG|evalue|score: Best matching Orthologous Groups (deprecated, use smallest from eggnog OGs)
  • COG_functional_categories: COG functional category inferred from best matching OG
  • eggNOG_free_text_description
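
The column list above can be used to load the annotations file into dictionaries. The following is a minimal sketch, not part of the tool: it assumes the columns appear in exactly the order listed, that lines starting with "#" are comments or headers, and that "-" marks an empty value. The helper names read_annotations and split_list are hypothetical.

```python
import csv
import io

# Column names copied from the list above; their order in the file is assumed.
ANNOTATION_COLUMNS = [
    "query_name", "seed_eggNOG_ortholog", "seed_ortholog_evalue",
    "seed_ortholog_score", "predicted_taxonomic_group",
    "predicted_protein_name", "GO_terms", "EC_number", "KEGG_KO",
    "KEGG_Pathway", "KEGG_Module", "KEGG_Reaction", "KEGG_rclass", "BRITE",
    "KEGG_TC", "CAZy", "BiGG_Reactions", "Annotation_tax_scope",
    "Matching_OGs", "best_OG|evalue|score", "COG_functional_categories",
    "eggNOG_free_text_description",
]


def read_annotations(handle):
    """Return one dict per query, keyed by the column names above."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines and (assumed) '#' comment/header lines.
        if not fields or fields[0].startswith("#"):
            continue
        rows.append(dict(zip(ANNOTATION_COLUMNS, fields)))
    return rows


def split_list(value):
    """Split a comma-delimited field; '' and '-' (assumed) mean no value."""
    if value in ("", "-"):
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


if __name__ == "__main__":
    # Made-up record, only to show the shape of the data.
    example = ["query1", "1234.P1", "1e-50", "200.0", "Bacteria", "dnaA",
               "GO:0003677,GO:0005524"] + ["-"] * 15
    handle = io.StringIO("\t".join(example) + "\n")
    for row in read_annotations(handle):
        print(row["query_name"], split_list(row["GO_terms"]))
```

In practice the handle would come from open() on the downloaded annotations dataset; the record in the example is invented for illustration.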

orthologs

This output is only created if the option --report_orthologs is checked. It lists the orthologs used for the annotation in a tab-delimited file with the following columns:

  • query
  • orth_type: type of orthologs in this row (see --target_orthologs)
  • species
  • orthologs: comma-separated list of orthologs (an ortholog marked with "*" was used to transfer its annotations to the query)
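
The four columns above can be read in a similar way. This sketch assumes one row per query, ortholog type and species, skips lines starting with "#", and treats a "*" anywhere in an ortholog name as the marker described above (the exact position of the marker is an assumption). The function name read_orthologs is hypothetical.

```python
import csv
import io


def read_orthologs(handle):
    """Parse the orthologs table into a list of dicts."""
    records = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip short rows and (assumed) '#' comment/header lines.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        query, orth_type, species, ortholog_field = fields[:4]
        orthologs = []
        for item in ortholog_field.split(","):
            item = item.strip()
            if not item:
                continue
            orthologs.append({
                "id": item.replace("*", ""),
                # True if this ortholog was used to transfer annotations.
                "used_for_annotation": "*" in item,
            })
        records.append({
            "query": query,
            "orth_type": orth_type,
            "species": species,
            "orthologs": orthologs,
        })
    return records


if __name__ == "__main__":
    # Made-up row, only to show the shape of the data.
    row = "query1\tone2one\tExample species\t*1234.P1,1234.P2\n"
    for record in read_orthologs(io.StringIO(row)):
        used = [o["id"] for o in record["orthologs"] if o["used_for_annotation"]]
        print(record["query"], record["orth_type"], used)
```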

sequences without annotation

This output is created if cached annotations are used as input. It is a FASTA file containing all sequences that are not found in the cached annotations. These sequences can then be used as input for another run of the EggNOG mapper computing seed orthologs with diamond, etc.

Recommendation for large input data

EggNOG-mapper consists of two phases:

  1. finding seed orthologous sequences (compute intensive)
  2. expanding annotations (IO intensive)

By default (i.e. if Method to search seed orthologs is not Skip search stage... and Annotate seed orthologs is Yes), both phases are executed within one tool run.

For large input FASTA datasets it can be favourable to split the work into separate tool runs as follows:

  1. Split the FASTA file (e.g. 1M sequences per dataset)
  2. Run the search phase only (set Annotate seed orthologs to No) on the separate FASTA files.
  3. Run the annotation phase (set Method to search seed orthologs to Skip search stage...)
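
Step 1 can be done with any FASTA splitting tool. As an illustration only, the sketch below splits a FASTA stream into chunks with a fixed number of sequences using plain Python. The function name split_fasta and the output file naming in the usage comment are hypothetical, and the chunk size should be chosen to suit the available resources.

```python
def split_fasta(lines, seqs_per_chunk):
    """Yield lists of FASTA lines, each holding at most seqs_per_chunk records.

    A record starts at a line beginning with '>' and runs until the next one.
    """
    chunk, count = [], 0
    for line in lines:
        if line.startswith(">"):
            if count == seqs_per_chunk:
                # Current chunk is full: emit it and start a new one.
                yield chunk
                chunk, count = [], 0
            count += 1
        chunk.append(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Tiny in-memory example; a real run would iterate over an open file, e.g.
    #   with open("input.fasta") as handle:
    #       for i, chunk in enumerate(split_fasta(handle, 1_000_000)):
    #           with open(f"chunk_{i}.fasta", "w") as out:
    #               out.writelines(chunk)
    demo = [">a\n", "ACGT\n", ">b\n", "TTGA\n", ">c\n", "GGCC\n"]
    for i, chunk in enumerate(split_fasta(demo, 2)):
        print(i, [line.strip() for line in chunk if line.startswith(">")])
```

Each resulting file can then be uploaded as a separate dataset for the search phase.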

See [also](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.8#Setting_up_large_annotation_jobs)

Another alternative is to use cached annotations (produced in a run with --md5 enabled).