correctGCBias (version 3.5.4+galaxy0)

Output of computeGCBias:

BAM/CRAM file:

This should be the same file that was used for computeGCBias.

Reference genome:

Using reference genome:

If your genome of interest is not listed, contact the Galaxy team.

Effective genome size:

The effective genome size is the portion of the genome that is mappable. Large fractions of the genome are stretches of NNNN that should be discarded. Also, if repetitive regions were not included in the mapping of reads, the effective genome size needs to be adjusted accordingly. We provide a table of useful sizes here: http://deeptools.readthedocs.io/en/latest/content/feature/effectiveGenomeSize.html
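As a rough illustration of the first point, the sketch below counts the bases that are not N in a toy set of sequences. It is only a starting estimate: it does not account for repetitive or otherwise unmappable regions, so the tabulated values linked above should be preferred for real genomes. The function name `count_non_n_bases` and the toy sequences are hypothetical and not part of deepTools.

```python
def count_non_n_bases(sequences):
    """Count bases that are not N (case-insensitive) across all sequences.

    Assumption: `sequences` is an iterable of plain sequence strings,
    one per chromosome or contig, already read from a FASTA file.
    """
    total = 0
    for seq in sequences:
        upper = seq.upper()
        # Subtract the masked (N) positions from the sequence length.
        total += len(upper) - upper.count("N")
    return total


# Toy genome: 20 bases in total, 8 of which are N.
toy_genome = ["ACGTNNNNACGT", "nnnnGGCC"]
print(count_non_n_bases(toy_genome))  # prints 12
```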

Region of the genome to limit the operation to:

This is useful when testing parameters to reduce the time required. The format is chr:start:end, for example "chr10" or "chr10:456700:891000".
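To make the two accepted forms concrete, here is a minimal sketch that splits a region string into its parts. It is an illustration of the format described above, not the parser the tool itself uses; `parse_region` is a hypothetical helper.

```python
def parse_region(region):
    """Split 'chrom' or 'chrom:start:end' into (chrom, start, end).

    start and end are None when only a chromosome name is given.
    """
    parts = region.split(":")
    if not parts[0]:
        raise ValueError("region must start with a chromosome name")
    if len(parts) == 1:
        # Whole chromosome, e.g. "chr10".
        return parts[0], None, None
    if len(parts) == 3:
        # Sub-range, e.g. "chr10:456700:891000".
        start, end = int(parts[1]), int(parts[2])
        if start >= end:
            raise ValueError("start must be smaller than end")
        return parts[0], start, end
    raise ValueError("expected 'chrom' or 'chrom:start:end'")


print(parse_region("chr10"))                # ('chr10', None, None)
print(parse_region("chr10:456700:891000"))  # ('chr10', 456700, 891000)
```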

What it does

This tool requires the output from computeGCBias to correct a given BAM file according to the method proposed in Benjamini and Speed (2012) Nucleic Acids Res. It removes reads from regions where the coverage is too high compared to the expected values (typically GC-rich regions) and adds reads to regions where too few reads are seen (typically AT-rich regions). The resulting BAM file can be used in any downstream analyses, but be aware that you should not filter out duplicates from here on.
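The idea of removing and adding reads can be illustrated with a toy sketch. It assumes that, for the GC content of a given read, an observed and an expected read count are available (for example from the computeGCBias output), and it decides how many copies of the read to write by rounding the expected/observed ratio at random. This is a simplified illustration under those assumptions, not the deepTools implementation; `copies_to_write` and the example numbers are hypothetical.

```python
import math
import random


def copies_to_write(observed, expected, rng):
    """Return how many copies of a read to write (0 means the read is dropped).

    Assumption: the ratio expected/observed is used as a weight and rounded
    up or down at random, so that the average number of copies equals the weight.
    """
    if observed <= 0:
        raise ValueError("observed must be positive")
    weight = expected / observed
    whole = math.floor(weight)
    fraction = weight - whole
    # Write one extra copy with probability equal to the fractional part.
    return whole + (1 if rng.random() < fraction else 0)


rng = random.Random(0)

# Over-represented bin (weight 0.5): about half of the reads are removed.
over = [copies_to_write(2.0, 1.0, rng) for _ in range(10000)]
# Under-represented bin (weight 2.5): each read is written 2 or 3 times.
under = [copies_to_write(1.0, 2.5, rng) for _ in range(10000)]

print(sum(over) / len(over))    # close to 0.5
print(sum(under) / len(under))  # close to 2.5
```

In the under-represented case every extra copy is an exact duplicate of an existing read, which is why duplicate filtering must not be applied to the corrected file.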

See the description of computeGCBias to read up on the details of the GC bias assessment and correction method.

Output files

correctGCBias has only one output: a BAM file where read densities have been changed to reflect the expected read distribution based on the genome.

Warning! The GC-corrected BAM file will most likely contain several duplicated reads in regions where the coverage had to be increased in order to match the expected read density. This means that you should absolutely avoid any filtering of duplicate reads during your downstream analyses!

For more information on the tools, please visit our help site.

For support or questions, please post to Biostars. For bug reports and feature requests, please open an issue on GitHub.

This tool is developed by the Bioinformatics and Deep-Sequencing Unit at the Max Planck Institute for Immunobiology and Epigenetics.