
SOPRA with prebuilt contigs (version 0.1)
Contigs files
Contigs file 0
Paired-end Illumina libraries
Paired-end Illumina library 0
May be 0, 1, 2, or 3
High-coverage contigs (above mean_coverage + h x std_coverage) are not considered in the scaffold assembly, mainly to exclude reads from repetitive regions.
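The coverage cutoff described above can be illustrated with a minimal Python sketch. It is not SOPRA's own code: the function name `filter_high_coverage`, the input dictionary, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def filter_high_coverage(coverage_by_contig, h):
    """Keep contigs whose coverage is at most mean_coverage + h * std_coverage.

    Hypothetical helper: SOPRA may compute the statistics differently
    (for example, sample rather than population standard deviation).
    """
    values = list(coverage_by_contig.values())
    threshold = mean(values) + h * pstdev(values)
    kept = [name for name, cov in coverage_by_contig.items() if cov <= threshold]
    return kept, threshold


# Made-up per-contig coverage values; contig_4 mimics a repetitive region.
coverage = {"contig_1": 20.0, "contig_2": 22.0, "contig_3": 18.0, "contig_4": 100.0}
kept, threshold = filter_high_coverage(coverage, h=1.0)
print(kept, round(threshold, 2))
```

With these made-up values, a smaller h excludes more contigs and a larger h keeps more of them in the scaffold assembly.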

What it does

SOPRA is a scaffold assembly tool for paired-end/mate pair data generated by high-throughput sequencing technologies, e.g. Illumina and SOLiD platforms. This wrapper currently supports only Illumina paired-end data.

Bowtie is used to align the reads to the contigs.

The input paired-end FASTA file can be obtained with:
FR reads -> FASTQ interlacer on paired end reads followed by FASTQ to FASTA converter
RF reads -> Reverse-Complement, FASTQ interlacer on paired end reads followed by FASTQ to FASTA converter
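For readers preparing the input outside Galaxy, the two recipes above can be approximated with the following Python sketch. It does not reproduce the Galaxy tools themselves. The helper names (`read_fastq`, `revcomp`, `interlace_to_fasta`) are hypothetical, and it assumes that for RF libraries both mates are reverse-complemented, which the text above does not spell out.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def read_fastq(lines):
    """Yield (name, sequence) pairs from FASTQ lines (4 lines per record)."""
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1]  # drop the leading '@' of the header


def interlace_to_fasta(fastq_1, fastq_2, reverse_complement_reads=False):
    """Interlace two mate files and return the result as FASTA text.

    Set reverse_complement_reads=True for RF libraries (assumption: both
    mates are reverse-complemented before interlacing).
    """
    out = []
    for (name_1, seq_1), (name_2, seq_2) in zip(read_fastq(fastq_1), read_fastq(fastq_2)):
        if reverse_complement_reads:
            seq_1, seq_2 = revcomp(seq_1), revcomp(seq_2)
        out += [f">{name_1}", seq_1, f">{name_2}", seq_2]
    return "\n".join(out) + "\n"


# Toy single-pair example; real input would come from open("reads_1.fastq") etc.
mates_1 = ["@pair1/1", "AACG", "+", "IIII"]
mates_2 = ["@pair1/2", "TTGA", "+", "IIII"]
fasta_fr = interlace_to_fasta(mates_1, mates_2)
fasta_rf = interlace_to_fasta(mates_1, mates_2, reverse_complement_reads=True)
print(fasta_fr)
```

The sketch relies on both files listing mates in the same order, which is also what an interlacer expects.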

TIP: Try trimming the ends of short reads before feeding them to the assembler to remove the error-prone bases (e.g. the last 10 to 20 bp), and check whether this improves the assembly.
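As a rough illustration of the tip above, the sketch below removes a fixed number of bases from the 3' end of each read. The function name `trim_read_end` and the `min_length` cutoff are assumptions for this example. A dedicated trimming tool would normally be used on real data.

```python
def trim_read_end(sequence, n_bases, min_length=1):
    """Remove n_bases from the end of a read.

    Returns None when the trimmed read would be shorter than min_length,
    so that callers can discard it (hypothetical convention).
    """
    keep = max(len(sequence) - n_bases, 0)
    trimmed = sequence[:keep]
    return trimmed if len(trimmed) >= min_length else None


# Toy 40 bp read: trim the last 10 bases, as suggested in the tip.
read = "A" * 30 + "C" * 10
trimmed_read = trim_read_end(read, n_bases=10)
print(len(read), len(trimmed_read))
```

Running the assembly once on the untrimmed reads and once on the trimmed reads gives the comparison the tip recommends.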


License and citation

This Galaxy tool is Copyright © 2013 CRS4 Srl. and is released under the MIT license.

If you use this tool in Galaxy, please cite Cuccuru, G., Orsini, M., Pinna, A., Sbardellati, A., Soranzo, N., Travaglione, A., Uva, P., Zanetti, G., Fotia, G. (2013) Orione, a web-based framework for NGS analysis in microbiology. Submitted.

This tool uses SOPRA, which is licensed separately. Please cite Dayarian, A., Michael, T. P., Sengupta, A. M. (2010) SOPRA: Scaffolding algorithm for paired reads via statistical optimization. BMC Bioinformatics 11, 345.