What it does
SOPRA is a scaffold assembly tool for paired-end and mate-pair data generated by high-throughput sequencing technologies such as the Illumina and SOLiD platforms. This wrapper currently supports only Illumina paired-end data.
Bowtie is used to align the reads to the contigs.
The input paired-end FASTA file can be obtained with:
- FR reads -> FASTQ interlacer on paired end reads, followed by FASTQ to FASTA converter
- RF reads -> Reverse-Complement, then FASTQ interlacer on paired end reads, followed by FASTQ to FASTA converter
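For readers who want to see what this preparation amounts to outside Galaxy, the following is a minimal Python sketch. It is not the implementation of the Galaxy tools named above. The function names (`reverse_complement`, `read_fastq`, `interlace_to_fasta`) and the `orientation` parameter are hypothetical. The sketch assumes that both input files list mates in the same order, that every record spans exactly four lines, and that for RF libraries both mates are reverse-complemented before interlacing.

```python
from itertools import islice

# Translation table for complementing DNA bases (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq(lines):
    """Yield (name, sequence) pairs from an iterable of FASTQ lines.

    Assumes the simple four-line-per-record FASTQ layout.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        header, seq = record[0].strip(), record[1].strip()
        yield header[1:], seq  # drop the leading '@'


def interlace_to_fasta(forward_lines, reverse_lines, orientation="FR"):
    """Yield interlaced FASTA records (mate 1, mate 2, mate 1, ...).

    With orientation="RF", both mates are reverse-complemented first
    (an assumption about how the RF recipe above is applied).
    """
    pairs = zip(read_fastq(forward_lines), read_fastq(reverse_lines))
    for (name1, seq1), (name2, seq2) in pairs:
        if orientation == "RF":
            seq1, seq2 = reverse_complement(seq1), reverse_complement(seq2)
        yield f">{name1}\n{seq1}"
        yield f">{name2}\n{seq2}"
```

In practice the two arguments could be open file handles, for example `interlace_to_fasta(open("reads_1.fastq"), open("reads_2.fastq"))`, where the file names are placeholders.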
TIP: Try trimming the ends of short reads before feeding them to the assembler to remove error-prone bases (e.g. the last 10 to 20 bp), and check whether this improves the assembly.
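As an illustration of the trimming step in the tip, here is a minimal Python sketch that removes a fixed number of bases from the 3' end of a read. The function name `trim_read_end` and its `n_bases` parameter are hypothetical, and the default of 10 is taken only from the example range in the tip. A dedicated read-trimming tool would normally be used for real data.

```python
def trim_read_end(sequence, quality, n_bases=10):
    """Remove the last n_bases from a read and its quality string.

    The quality string is trimmed to the same length so the FASTQ
    record stays consistent. Reads shorter than n_bases become empty.
    """
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    # Compute the kept length explicitly; slicing with [:-0] would
    # return an empty string instead of the untrimmed read.
    keep = max(len(sequence) - n_bases, 0)
    return sequence[:keep], quality[:keep]
```

Comparing assemblies produced with a few different values of `n_bases` (for instance 0, 10 and 20) is one way to check whether trimming helps for a given dataset.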
License and citation
This Galaxy tool is Copyright © 2013 CRS4 Srl. and is released under the MIT license.
If you use this tool in Galaxy, please cite Cuccuru, G., Orsini, M., Pinna, A., Sbardellati, A., Soranzo, N., Travaglione, A., Uva, P., Zanetti, G., Fotia, G. (2013) Orione, a web-based framework for NGS analysis in microbiology. Submitted.
This tool uses SOPRA, which is licensed separately. Please cite Dayarian, A., Michael, T. P., Sengupta, A. M. (2010) SOPRA: Scaffolding algorithm for paired reads via statistical optimization. BMC Bioinformatics 11, 345.