Mercurial > repos > peterjc > coverage_stats
view tools/coverage_stats/coverage_stats.xml @ 1:d1fdfaae5dbe draft
Uploaded v0.0.1b with warning about samtools 8000 coverage limit
author | peterjc |
---|---|
date | Fri, 21 Nov 2014 09:43:58 -0500 |
parents | ca8f63f2f7d4 |
children | 7254ece0c0ff |
line wrap: on
line source
<tool id="coverage_stats" name="BAM coverage statistics" version="0.0.1"> <description>using samtools idxstats and depth</description> <requirements> <requirement type="binary">samtools</requirement> <requirement type="package" version="0.1.19">samtools</requirement> </requirements> <version_command interpreter="python">coverage_stats.py --version</version_command> <command interpreter="python">coverage_stats.py "$input_bam" "${input_bam.metadata.bam_index}" "$out_tabular"</command> <inputs> <param name="input_bam" type="data" format="bam" label="Input BAM file" /> </inputs> <outputs> <data name="out_tabular" format="tabular" label="$input_bam.name (coverage stats)" /> </outputs> <stdio> <!-- Assume anything other than zero is an error --> <exit_code range="1:" /> <exit_code range=":-1" /> </stdio> <tests> <test> <param name="input_bam" value="ex1.bam" ftype="bam" /> <output name="out_tabular" file="ex1.coverage_stats.tabular" ftype="tabular" /> </test> <test> <param name="input_bam" value="coverage_test.bam" ftype="bam" /> <output name="out_tabular" file="coverage_test.coverage_stats.tabular" ftype="tabular" /> </test> </tests> <help> **What it does** This tool runs the commands ``samtools idxstats`` and ``samtools depth`` from the SAMtools toolkit, and parses their output to produce a consise summary of the coverage information for each reference sequence. Input is a sorted and indexed BAM file, the output is tabular. The first four columns match the output from ``samtools idxstats``, the additional columns are calculated from the ``samtools depth`` output. The final row with a star as the reference identifier represents unmapped reads, and will have zeros in every column except columns one and four. ====== ================================================================================= Column Description ------ --------------------------------------------------------------------------------- 1 Reference sequence identifier 2 Reference sequence length 3 Number of mapped reads 4 Number of placed but unmapped reads (typically unmapped partners of mapped reads) 5 Minimum coverage 6 Maximum coverage 7 Mean coverage (given to 2 dp) ====== ================================================================================= Example output from a *de novo* assembly: ========== ====== ====== ====== ======= ======= ======== identiifer length mapped placed min_cov max_cov mean_cov ---------- ------ ------ ------ ------- ------- -------- contig_1 833604 436112 0 1 157 71.95 contig_2 14820 9954 0 1 152 91.27 contig_3 272099 142958 0 1 150 72.31 contig_4 135519 73288 0 1 149 75.23 contig_5 91245 46759 0 1 157 70.92 contig_6 175604 95744 0 1 146 75.99 contig_7 90586 48158 0 1 151 72.93 contig_9 234347 126458 0 1 159 75.40 contig_10 121515 60211 0 1 152 68.12 ... ... ... ... ... ... ... contig_604 712 85 0 1 49 21.97 \* 0 0 950320 0 0 0.00 ========== ====== ====== ====== ======= ======= ======== In this example there were 604 contigs, each with one line in the output table, plus the final row (labelled with an asterisk) representing 950320 unmapped reads. In this BAM file, the fourth column was otherwise zero. .. class:: warningmark **Note**. There is an internal hard limit of 8000 for the pileup routine in samtools, meaning the reported coverage from ``samtools depth`` will show maximum coverage depths *around* 8000. **Citation** If you use this Galaxy tool in work leading to a scientific publication please cite: Heng Li et al (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16), 2078-9. http://dx.doi.org/10.1093/bioinformatics/btp352 Peter J.A. Cock (2013), BAM coverage statistics using samtools idxstats and depth. http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats This wrapper is available to install into other Galaxy Instances via the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/coverage_stats </help> <citations> <citation type="doi">10.1093/bioinformatics/btp352</citation> </citations> </tool>