
IDBA-UD (version 1.1.3+galaxy1)

IDBA is an iterative de Bruijn graph de novo sequence assembler. Most de Bruijn graph assemblers build the graph with a single k-mer size, so choosing a suitable value of k is crucial. If k is too large, the graph has many gaps; if k is too small, it has many branches. Instead of a single k, IDBA iterates over a range of k values to build the de Bruijn graph, so it retains the information from the graphs at every k.
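
The trade-off between small and large k can be illustrated with a toy de Bruijn graph. The following is a minimal sketch, not IDBA's implementation: the function names and the two example reads are hypothetical, and reverse complements and sequencing errors are ignored.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def count_branches(graph):
    """Count nodes with more than one outgoing edge (ambiguous extensions)."""
    return sum(1 for successors in graph.values() if len(successors) > 1)


def count_dead_ends(graph):
    """Count nodes that are reached by an edge but have no outgoing edge."""
    targets = set().union(*graph.values())
    return sum(1 for node in targets if node not in graph)


# Two hypothetical reads from the sequence AAGCTTGCTA, overlapping by 4 bases.
reads = ["AAGCTTG", "CTTGCTA"]

for k in (3, 5, 6):
    graph = de_bruijn_edges(reads, k)
    print(k, count_branches(graph), count_dead_ends(graph))
```

In this toy case, k = 3 gives one branch (the repeat GCT is not resolved), k = 5 gives a single unbranched path, and k = 6 adds a second dead end because no read contains the k-mer that spans the overlap, which is the kind of gap described above.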

IDBA-UD is an extension of the IDBA algorithm for short-read sequencing data with highly uneven sequencing depth. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively, with the cutoff threshold raised from low to high, to reduce errors in both low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to recover some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate k to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.

Input: The IDBA-* tools take interleaved paired-end data in FASTA format as input, i.e. both reads of a pair must be stored in the same FASTA file as two consecutive records. In Galaxy, paired reads in separate FASTQ files can be converted into interleaved FASTA using the tools:

Note that IDBA-* assumes that paired-end reads are in the order (->,<-). If your data is in the reverse order (<-,->), you must convert it yourself.
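
Outside Galaxy, the input conventions above can be sketched in a few lines of Python. This is an illustrative sketch only, not one of the Galaxy tools: the names `read_fastq`, `interleave_to_fasta` and the `flip` option are hypothetical, the FASTQ files are assumed to use exactly four lines per record, and the conversion assumes that (<-,->) denotes outward-facing reads, so that reverse-complementing both mates yields the (->,<-) orientation. Check this assumption against your library preparation before relying on it.

```python
import io

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield (name, sequence) from a FASTQ stream with four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line (discarded, FASTA has no qualities)
        yield header.strip()[1:], seq


def interleave_to_fasta(fq1, fq2, out, flip=False):
    """Write mates from two FASTQ streams as consecutive FASTA records.

    With flip=True both mates are reverse-complemented (assumed conversion
    from outward-facing to inward-facing pairs). zip() stops at the shorter
    input, so unpaired trailing reads are silently dropped.
    """
    for (name1, seq1), (name2, seq2) in zip(read_fastq(fq1), read_fastq(fq2)):
        if flip:
            seq1, seq2 = reverse_complement(seq1), reverse_complement(seq2)
        out.write(f">{name1}\n{seq1}\n>{name2}\n{seq2}\n")


# Usage with in-memory data; real files opened with open() work the same way.
fq1 = io.StringIO("@r1/1\nAACG\n+\nIIII\n")
fq2 = io.StringIO("@r1/2\nTTGA\n+\nIIII\n")
out = io.StringIO()
interleave_to_fasta(fq1, fq2, out)
print(out.getvalue(), end="")
```

Quality scores are dropped because FASTA carries none, and each pair ends up as two adjacent records in a single file, which is the layout IDBA-* expects.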