
Kodoja is a tool intended to identify viral sequences in a FASTQ/FASTA sequencing run by matching them against both Kraken and Kaiju databases.

The main output is a tab-separated table (tabular format in Galaxy) with the following columns:

Species name
Species NCBI taxonomy identifier (TaxID)
Number of reads assigned by either Kraken or Kaiju to this species
Number of reads assigned by both Kraken and Kaiju to this species
Genus name
Number of reads assigned by either Kraken or Kaiju to this genus
Number of reads assigned by both Kraken and Kaiju to this genus

The counts in columns 6 and 7 are for reads assigned to that genus, but not to any species within it.

For example,

Species	Species TaxID	Species sequences	Species sequences (stringent)	Genus	Genus sequences	Genus sequences (stringent)
Cassava brown streak virus	137758	45	45	Ipomovirus	0	0
Ugandan cassava brown streak virus	946046	28	28	Ipomovirus	0	0
Tobacco etch virus	12227	21	19	Potyvirus	0	0
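
As a rough illustration of how this table might be used downstream, the sketch below parses the example rows above with Python's standard csv module and keeps only species whose stringent count (reads assigned by both Kraken and Kaiju) reaches a threshold. The function name and the threshold of 20 are assumptions made for this example; they are not part of Kodoja.

```python
import csv
import io

# The example rows from the table above, tab-separated.
EXAMPLE = (
    "Species\tSpecies TaxID\tSpecies sequences\t"
    "Species sequences (stringent)\tGenus\tGenus sequences\t"
    "Genus sequences (stringent)\n"
    "Cassava brown streak virus\t137758\t45\t45\tIpomovirus\t0\t0\n"
    "Ugandan cassava brown streak virus\t946046\t28\t28\tIpomovirus\t0\t0\n"
    "Tobacco etch virus\t12227\t21\t19\tPotyvirus\t0\t0\n"
)


def species_with_min_stringent(handle, min_reads):
    """Return (species, stringent count) pairs with count >= min_reads.

    Hypothetical helper for illustration only; not part of Kodoja.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    hits = []
    for row in reader:
        count = int(row["Species sequences (stringent)"])
        if count >= min_reads:
            hits.append((row["Species"], count))
    return hits


# Threshold of 20 is an arbitrary choice for this example.
hits = species_with_min_stringent(io.StringIO(EXAMPLE), 20)
for name, count in hits:
    print(name, count)
```

With this threshold, Tobacco etch virus is dropped because only 19 of its 21 reads were assigned by both tools.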

The second most important output, which you can optionally capture for use within Galaxy, is a per-read table summarising matches found with Kraken and/or Kaiju. The Kodoja Retrieve tool is not currently available within Galaxy, but you can instead use this file directly within Galaxy to filter out just the virus reads, or even reads matched to a specific TaxID. See, for example, seq_filter_by_id, which is available via the Galaxy Tool Shed:

http://toolshed.g2.bx.psu.edu/view/peterjc/seq_filter_by_id
https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_id
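
Outside Galaxy, the same idea can be sketched in a few lines of Python: collect the read identifiers assigned to one TaxID from the per-read table, then pass that list to whatever filtering tool you prefer. The column layout of the per-read table is not described here, so the column names below ("read_id" and "taxid") and the demo rows are purely hypothetical; check the header of your own file and pass the real names.

```python
import csv
import io


def read_ids_for_taxid(handle, taxid, id_column, taxid_column):
    """Return read identifiers whose taxid_column equals taxid.

    Hypothetical helper: the caller must supply the actual column
    names used in their per-read table.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    wanted = str(taxid)
    return [row[id_column] for row in reader if row[taxid_column] == wanted]


# Made-up rows for illustration only; this is NOT real Kodoja output
# and the column names are assumptions.
DEMO = (
    "read_id\ttaxid\n"
    "read1\t137758\n"
    "read2\t12227\n"
    "read3\t137758\n"
)

ids = read_ids_for_taxid(io.StringIO(DEMO), 137758, "read_id", "taxid")
print("\n".join(ids))
```

Comparing TaxIDs as strings avoids errors if the column contains blank or non-numeric entries for unclassified reads.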

The Kodoja Search command line tool offers additional options not currently exposed in Galaxy, including:

                      Number of threads
-s, --host_subset     Subset host sequences before Kaiju
-m TRIM_MINLEN, --trim_minlen TRIM_MINLEN
                      Trimmomatic minimum length
-a TRIM_ADAPT, --trim_adapt TRIM_ADAPT
                      Illumina adapter sequence file
-q KRAKEN_QUICK, --kraken_quick KRAKEN_QUICK
                      Number of minimum hits by Kraken
-p, --kraken_preload  Kraken preload database
-c KAIJU_SCORE, --kaiju_score KAIJU_SCORE
                      Kaiju alignment score
-l KAIJU_MINLEN, --kaiju_minlen KAIJU_MINLEN
                      Kaiju minimum length
-i KAIJU_MISMATCH, --kaiju_mismatch KAIJU_MISMATCH
                      Kaiju allowed mismatches

For more information, please see the Kodoja manual: https://github.com/abaizan/kodoja/wiki/Kodoja-Manual