Galaxy | Tool Preview

HUMAnN (version 3.9+galaxy0)
Paired-end Fasta/FastQ files should be merged first
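For this note, merging usually means plain concatenation: the HUMAnN documentation recommends placing all reads from both mates in a single file, because pairing information is not used during alignment. A minimal sketch of that step follows. The function names and file names are hypothetical, and each input is assumed to end with a newline.

```python
import gzip
import shutil
from pathlib import Path


def _open_maybe_gzip(path, mode):
    """Open a plain or gzip-compressed file in binary mode, based on its suffix."""
    path = Path(path)
    return gzip.open(path, mode) if path.suffix == ".gz" else open(path, mode)


def concatenate_reads(forward, reverse, merged):
    """Write every record of `forward`, then every record of `reverse`, to `merged`.

    Assumption: both inputs end with a newline, so records do not run together.
    """
    with _open_maybe_gzip(merged, "wb") as out:
        for source in (forward, reverse):
            with _open_maybe_gzip(source, "rb") as handle:
                shutil.copyfileobj(handle, out)
    return Path(merged)
```

In Galaxy itself, a concatenation tool applied to the two datasets should give the same result.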
Prescreen / Identifying community species
Nucleotide search / Mapping reads to community pangenomes
Translated search / Aligning unmapped reads to a protein database
Gene and pathway quantifications
Outputs

What it does

HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data.

Read more about the tool: http://huttenhower.sph.harvard.edu/humann

This tool corresponds to the main tool in the HUMAnN pipeline, which runs the following steps:

  1. Taxonomic prescreen

    Reads are mapped (with MetaPhlAn) to clade-specific marker genes to rapidly identify community species

  2. Pangenome search (nucleotide search)

    Reads are mapped (with Bowtie2) to pangenomes of identified species

  3. Translated search

    Unclassified reads are aligned to a comprehensive and non-redundant protein database

  4. Gene family and pathway quantification

    • Gene abundance estimation

      Mapping results are processed to estimate per-species and community total gene family abundance, weighting by

      • alignment quality
      • gene length
      • gene coverage
    • Per-species and community-level metabolic network reconstruction

      Genes are mapped to metabolic reactions to identify a parsimonious set of pathways that explains each species' observed reactions

      Pathway abundance and coverage are quantified by:

      1. optimizing over alternative subpathways
      2. imputing abundance for conspicuously depleted reactions
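The tiered structure of steps 1 to 3 can be illustrated with a toy model. This is only a sketch under strong simplifications, not HUMAnN's implementation: exact dictionary lookup stands in for alignment, raw read counts are used without weighting by alignment quality, gene length, or coverage, and all names (`tiered_search`, `pangenomes`, `protein_db`) are hypothetical.

```python
from collections import Counter


def tiered_search(reads, detected_species, pangenomes, protein_db):
    """Assign each read to a gene family using two search tiers.

    reads: dict of read ID -> sequence
    detected_species: species names reported by the prescreen
    pangenomes: dict of species -> {sequence: gene family}
    protein_db: dict of sequence -> gene family
    """
    stratified = Counter()  # (gene family, species) -> read count
    unmapped = 0
    for seq in reads.values():
        # Tier 1: search only the pangenomes of species found by the prescreen.
        species = next(
            (s for s in detected_species if seq in pangenomes.get(s, {})), None
        )
        if species is not None:
            stratified[(pangenomes[species][seq], species)] += 1
        # Tier 2: reads without a pangenome hit go to the protein database.
        elif seq in protein_db:
            stratified[(protein_db[seq], "unclassified")] += 1
        else:
            unmapped += 1

    # Community totals are the sums of the per-species counts.
    community = Counter()
    for (family, _species), count in stratified.items():
        community[family] += count
    return stratified, community, unmapped
```

Restricting the first tier to species found by the prescreen keeps the nucleotide search small, and only the reads it leaves unmapped reach the slower translated search.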

Inputs

HUMAnN can start from several types of input data, each available in several formats:

  • Quality-controlled shotgun sequencing reads

    This is the most common starting point: a metagenome (DNA reads) or metatranscriptome (RNA reads)

  • Pre-computed mappings of reads to database sequences

  • Pre-computed (typically gene) abundance tables

HUMAnN uses 3 reference databases. Locally cached databases have to be downloaded before they can be used (with the dedicated tool). Custom databases can also be used after upload.
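Outside Galaxy, these input and database choices correspond to options of the `humann` command-line program. The sketch below only assembles an argument list and does not run anything. The helper name `build_humann_command` and the example paths are hypothetical, and the flags should be checked against `humann --help` for the installed version.

```python
def build_humann_command(input_path, output_dir, input_format=None,
                         nucleotide_db=None, protein_db=None, threads=1):
    """Assemble an argument list for the `humann` executable.

    input_format is optional because the format can often be detected from
    the file itself; examples for the three input types are "fastq" (reads),
    "sam" (pre-computed mappings) and "genetable" (abundance tables).
    """
    command = [
        "humann",
        "--input", str(input_path),
        "--output", str(output_dir),
        "--threads", str(threads),
    ]
    if input_format is not None:
        command += ["--input-format", input_format]
    if nucleotide_db is not None:
        # Directory holding the pangenome (nucleotide) database.
        command += ["--nucleotide-database", str(nucleotide_db)]
    if protein_db is not None:
        # Directory holding the protein database for the translated search.
        command += ["--protein-database", str(protein_db)]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where HUMAnN and its databases are installed.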

Outputs

HUMAnN creates three output files:

  • Gene families and their abundance
  • Pathways and their abundance
  • Pathways and their coverage

Ten intermediate temporary output files can also be retrieved.
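The gene family and pathway tables are tab-separated text. Each feature has a community total row and, where available, per-species rows whose names carry a `|species` suffix. The following sketch separates the two kinds of rows. The function name is hypothetical, and the layout is assumed to be a `#` header line followed by one feature per line.

```python
def split_abundance_table(tsv_text):
    """Split a HUMAnN-style table into community totals and per-species values.

    Returns (community, per_species) where
      community:   feature -> abundance
      per_species: feature -> {species: abundance}
    """
    community = {}
    per_species = {}
    for line in tsv_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        feature, value = line.split("\t")[:2]
        abundance = float(value)
        if "|" in feature:
            # Stratified row: "<feature>|<species>"
            name, species = feature.split("|", 1)
            per_species.setdefault(name, {})[species] = abundance
        else:
            community[feature] = abundance
    return community, per_species
```

A short usage example with made-up values:

```python
example = (
    "# Gene Family\tsample_Abundance-RPKs\n"
    "UNMAPPED\t10.0\n"
    "UniRef90_A\t6.0\n"
    "UniRef90_A|g__Bacteroides.s__Bacteroides_dorei\t4.0\n"
    "UniRef90_A|unclassified\t2.0\n"
)
community, per_species = split_abundance_table(example)
```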