Kodoja is a tool intended to identify viral sequences in a FASTQ/FASTA sequencing run by matching them against both Kraken and Kaiju databases.

The main output is a tab-separated table (tabular format in Galaxy) with the following columns:

  1. Species name
  2. Species NCBI taxonomy identifier (TaxID)
  3. Number of reads assigned by either Kraken or Kaiju to this species
  4. Number of reads assigned by both Kraken and Kaiju to this species
  5. Genus name
  6. Number of reads assigned by either Kraken or Kaiju to this genus
  7. Number of reads assigned by both Kraken and Kaiju to this genus

The counts in columns 6 and 7 are for reads assigned to that genus, but not to any species within it.

For example,

Species                             Species TaxID  Species sequences  Species sequences (stringent)  Genus       Genus sequences  Genus sequences (stringent)
Cassava brown streak virus          137758         45                 45                             Ipomovirus  0                0
Ugandan cassava brown streak virus  946046         28                 28                             Ipomovirus  0                0
Tobacco etch virus                  12227          21                 19                             Potyvirus   0                0
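
As a rough illustration of how this table might be used downstream, the Python sketch below parses the example rows and reports (a) species with a minimum number of reads agreed on by both tools, and (b) a total per genus. It assumes the header names match the example shown here, and that the genus-only counts (columns 6 and 7) are repeated on every row of the same genus. Both are assumptions to check against your own output. The function names are illustrative and not part of Kodoja.

```python
import csv
import io
from collections import defaultdict

HEADER = ["Species", "Species TaxID", "Species sequences",
          "Species sequences (stringent)", "Genus", "Genus sequences",
          "Genus sequences (stringent)"]

# Example rows from this page, joined with tabs as in the real output.
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    HEADER,
    ["Cassava brown streak virus", "137758", "45", "45", "Ipomovirus", "0", "0"],
    ["Ugandan cassava brown streak virus", "946046", "28", "28", "Ipomovirus", "0", "0"],
    ["Tobacco etch virus", "12227", "21", "19", "Potyvirus", "0", "0"],
])


def load_summary(handle):
    """Read a tab-separated summary table into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def stringent_species(rows, min_reads):
    """Names of species with at least min_reads reads assigned by both tools."""
    return [row["Species"] for row in rows
            if int(row["Species sequences (stringent)"]) >= min_reads]


def genus_totals(rows):
    """Reads per genus: species-level reads plus genus-only reads (column 6).

    Assumption: the genus-only count is the same on every row of a genus,
    so it is added once per genus rather than once per row.
    """
    totals = defaultdict(int)
    genus_only = {}
    for row in rows:
        totals[row["Genus"]] += int(row["Species sequences"])
        genus_only[row["Genus"]] = int(row["Genus sequences"])
    for genus, count in genus_only.items():
        totals[genus] += count
    return dict(totals)


rows = load_summary(io.StringIO(EXAMPLE))
print(stringent_species(rows, 20))
print(genus_totals(rows))
```

With a real file, replace `io.StringIO(EXAMPLE)` with an open file handle for the table downloaded from Galaxy.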

The second output, which you can optionally capture for use within Galaxy, is a per-read table summarising the matches found with Kraken and/or Kaiju. The Kodoja Retrieve tool is not currently available within Galaxy, but you can use this file directly within Galaxy to extract just the virus reads, or only the reads matched to a specific TaxID. See, for example, seq_filter_by_id, which is available via the Galaxy Tool Shed:

http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_id
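
Outside Galaxy, the same idea can be sketched in a few lines of Python: collect the identifiers of reads assigned to one TaxID, giving a list that an ID-based filter could then use. The column layout of the per-read table is not described on this page, so the column names `read_id` and `taxid` and the demo rows below are purely hypothetical placeholders. Check the header of your own file and pass the real column names instead.

```python
import csv
import io


def read_ids_for_taxid(handle, taxid, id_column, taxid_column):
    """Return IDs of reads whose taxid_column equals taxid.

    id_column and taxid_column are supplied by the caller because the
    real column names are not documented here (hypothetical in this demo).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[id_column] for row in reader if row[taxid_column] == str(taxid)]


# Made-up per-read table for demonstration only; not real Kodoja output.
DEMO = "read_id\ttaxid\nread1\t137758\nread2\t12227\nread3\t137758\n"

ids = read_ids_for_taxid(io.StringIO(DEMO), 137758, "read_id", "taxid")
print("\n".join(ids))
```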

The Kodoja Search command line tool offers additional options not currently exposed in Galaxy, including:

                      Number of threads
-s, --host_subset     Subset host sequences before Kaiju
-m TRIM_MINLEN, --trim_minlen TRIM_MINLEN
                      Trimmomatic minimum length
-a TRIM_ADAPT, --trim_adapt TRIM_ADAPT
                      Illumina adapter sequence file
-q KRAKEN_QUICK, --kraken_quick KRAKEN_QUICK
                      Minimum number of hits by Kraken
-p, --kraken_preload  Kraken preload database
-c KAIJU_SCORE, --kaiju_score KAIJU_SCORE
                      Kaiju alignment score
-l KAIJU_MINLEN, --kaiju_minlen KAIJU_MINLEN
                      Kaiju minimum length
-i KAIJU_MISMATCH, --kaiju_mismatch KAIJU_MISMATCH
                      Kaiju allowed mismatches

For more information, please see the Kodoja manual: https://github.com/abaizan/kodoja/wiki/Kodoja-Manual