What it does
This tool runs the samtools idxstats command from the SAMtools toolkit.
The input is a sorted and indexed BAM file. The output is a table with four columns, with one row per reference sequence plus a final row for unmapped reads:
Column | Description |
1 | Reference sequence identifier |
2 | Reference sequence length |
3 | Number of mapped reads |
4 | Number of placed but unmapped reads (typically unmapped partners of mapped reads) |
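The four columns above can be read into typed records with a few lines of Python. This is a minimal sketch, not part of the tool itself: it assumes the raw output is tab-separated text (as samtools idxstats writes it), and the names IdxstatsRow and parse_idxstats are hypothetical helpers chosen for illustration.

```python
from typing import List, NamedTuple


class IdxstatsRow(NamedTuple):
    """One row of idxstats output (hypothetical helper type)."""
    reference: str        # column 1: reference sequence identifier
    length: int           # column 2: reference sequence length
    mapped: int           # column 3: number of mapped reads
    placed_unmapped: int  # column 4: placed but unmapped reads


def parse_idxstats(text: str) -> List[IdxstatsRow]:
    """Parse tab-separated idxstats text into a list of rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length, mapped, unmapped = line.split("\t")
        rows.append(IdxstatsRow(name, int(length), int(mapped), int(unmapped)))
    return rows
```

The final row, whose identifier is an asterisk, is parsed like any other row, so callers can pick it out by checking reference == "*".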
Example output from a de novo assembly:
contig_1 | 170035 | 98397 | 0 |
contig_2 | 403835 | 199564 | 0 |
contig_3 | 553102 | 288189 | 0 |
... | ... | ... | ... |
contig_603 | 653 | 50 | 0 |
contig_604 | 214 | 6 | 0 |
* | 0 | 0 | 50320 |
In this example there were 604 contigs, each with one row in the output table, plus the final row (labelled with an asterisk) representing 50320 unmapped reads. For every contig in this BAM file, the fourth column was zero.
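The reading of the example above can be expressed as a small summary function. This is an illustrative sketch only: the summarize function is a hypothetical helper, and the data contains just the rows printed in the example (the rows elided with "..." are not reproduced, so the totals cover five contigs rather than all 604).

```python
# Rows copied from the example table; the elided rows are deliberately omitted.
example_rows = [
    ("contig_1", 170035, 98397, 0),
    ("contig_2", 403835, 199564, 0),
    ("contig_3", 553102, 288189, 0),
    ("contig_603", 653, 50, 0),
    ("contig_604", 214, 6, 0),
    ("*", 0, 0, 50320),
]


def summarize(rows):
    """Summarize idxstats rows given as (name, length, mapped, unmapped) tuples."""
    # Every row except the final asterisk row describes a reference sequence.
    references = [row for row in rows if row[0] != "*"]
    return {
        "n_references": len(references),
        "mapped": sum(row[2] for row in references),
        "placed_unmapped": sum(row[3] for row in references),
        # The asterisk row carries the unmapped reads with no reference.
        "unplaced_unmapped": sum(row[3] for row in rows if row[0] == "*"),
    }


summary = summarize(example_rows)
```

Separating the asterisk row from the per-reference rows keeps the count of reference sequences from being off by one.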
Citation
If you use this Galaxy tool in work leading to a scientific publication please cite:
Heng Li et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16), 2078-9. http://dx.doi.org/10.1093/bioinformatics/btp352
Peter J.A. Cock (2013). Galaxy wrapper for the samtools idxstats command. http://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
This wrapper can be installed into other Galaxy instances via the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats