medaka consensus pipeline (version 1.7.2+galaxy1)

Select basecalls:

Select assembly:

The input assembly should be preprocessed with racon.

Select model:

For best results, it is important to specify the correct model for the basecaller used. Medaka models are named to indicate i) the pore type, ii) the sequencing device (MinION or PromethION), iii) the basecaller variant, and iv) the basecaller version.

Set inference batch size:

Don't fill gaps in consensus with draft sequence?:

Select output file(s):

'Draft To Consensus', 'Variants' and 'Polished regions in draft coordinates' are generated using the parameter -v.

What it does

medaka is a tool suite to create a consensus sequence from nanopore sequencing data.

This task is performed using neural networks applied to a pileup of individual sequencing reads against a draft assembly. It outperforms graph-based methods operating on basecalled data and can be competitive with state-of-the-art signal-based methods, whilst being much faster.

The medaka_consensus pipeline performs assembly polishing via neural networks.

Input

An assembly in .fasta format and basecalls in .fasta or .fastq format are required. See Creating a Draft Assembly for a detailed example of one method of obtaining these.

Output

Consensus polished assembly (FASTA)
Consensus Probabilities (H5/HDF)
Calls To Draft (BAM)
Draft To Consensus (chain, TXT)
Variants: VCF of changes (VCF)
Polished: BED file of polished regions (BED)

Models

For best results, it is important to specify the correct model (the -m option of medaka_consensus, set with "Select model" in the form above) according to the basecaller used. Allowed values can be found by running medaka tools list_models.

Medaka models are named to indicate i) the pore type, ii) the sequencing device (MinION or PromethION), iii) the basecaller variant, and iv) the basecaller version, with the format:

{pore}_{device}_{caller variant}_{caller version}

For example, the model named r941_min_fast_g303 should be used with data from MinION (or GridION) R9.4.1 flowcells basecalled with the fast Guppy basecaller, version 3.0.3. By contrast, the model r941_prom_hac_g303 should be used with PromethION data and the high accuracy basecaller (termed "hac" in Guppy configuration files). Where a version of Guppy has been used without an exactly corresponding medaka model, select the medaka model with the highest version equal to or less than the Guppy version.
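The naming scheme and the "highest version equal to or less than" rule can be sketched in Python as follows. This is an illustration only, not part of medaka or of this Galaxy tool. The helper names parse_model and pick_model are hypothetical. The sketch assumes that the digits after "g" encode a one-digit major version, a one-digit minor version, and the remaining digits as the patch level (so g303 is read as 3.0.3), and it ignores any model name that does not follow the four-part format. The list of available models is made up for the example, apart from the two names quoted above; a real list should come from medaka tools list_models.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class ModelName(NamedTuple):
    pore: str
    device: str
    variant: str
    version: Tuple[int, int, int]


# Assumption: {pore}_{device}_{caller variant}_g{major}{minor}{patch},
# with single-digit major and minor numbers.
_PATTERN = re.compile(r"^([a-z0-9]+)_([a-z]+)_([a-z]+)_g(\d)(\d)(\d+)$")


def parse_model(name: str) -> Optional[ModelName]:
    """Split a four-part model name into its fields, or return None."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    pore, device, variant, major, minor, patch = match.groups()
    return ModelName(pore, device, variant, (int(major), int(minor), int(patch)))


def pick_model(
    available: List[str],
    pore: str,
    device: str,
    variant: str,
    guppy_version: Tuple[int, int, int],
) -> Optional[str]:
    """Return the matching model with the highest version <= guppy_version."""
    candidates = []
    for name in available:
        parsed = parse_model(name)
        if parsed is None:
            continue  # not in the four-part format handled by this sketch
        if (parsed.pore, parsed.device, parsed.variant) != (pore, device, variant):
            continue
        if parsed.version <= guppy_version:
            candidates.append((parsed.version, name))
    if not candidates:
        return None
    return max(candidates)[1]


# Hypothetical list of installed models, for illustration only.
AVAILABLE = [
    "r941_min_fast_g303",
    "r941_min_fast_g330",
    "r941_min_fast_g344",
    "r941_prom_hac_g303",
]

# MinION R9.4.1 data basecalled with fast Guppy 3.4.0: no g340 model is
# listed, so the closest earlier one is chosen.
print(pick_model(AVAILABLE, "r941", "min", "fast", (3, 4, 0)))  # r941_min_fast_g330
```

If no model with a version at or below the Guppy version exists for the given pore, device, and variant, pick_model returns None rather than guessing.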

References

More information is available in the manual and on GitHub.