This tool is the counterpart of estimateReadFiltering: it filters alignments based on a variety of criteria. While much of this can be done with samtools, this tool can additionally filter by fragment strand and length (e.g., for RNA-seq and ATAC-seq experiments, respectively). Finally, it can produce BEDPE files, optionally with the fragment ends shifted, which can be used as input to MACS2 for peak calling.
The primary output is a BAM file with all alignments passing the desired criteria. Note that all unmapped reads are removed. Additionally, an optional text file can be produced with the following entries:
- Number of reads passing the filtering criteria
- Total number of initial reads
Instead of a filtered BAM file, the tool can produce a BEDPE file appropriate for use with MACS2, optionally with fragment ends shifted. This is useful for protocols such as ATAC-seq.
The --shift option can take either 2 or 4 integers. If two integers are given, then the first value shifts the left-most end of a fragment and the second the right-most end. Positive values shift to the right and negative values to the left. See below for how setting --shift to '-5 3' would shift a single fragment:
         ----> read 1
                      read 2 <----

         ------------------------ fragment

    -------------------------------- shifted fragment
The same results will be produced if read 1 and read 2 are swapped. If, instead, the protocol is strand-specific, then four integers can be given as two pairs: the first pair is applied to fragments where read 1 precedes read 2, and the second pair to fragments where read 2 precedes read 1. In this case, the first value in each pair is applied to the end of read 1 and the second to the end of read 2. For example, suppose "-5 3 -1 4" were given as the option to --shift. The -5 3 pair would produce the following:
         ----> read 1
                      read 2 <----

         ------------------------ fragment

    -------------------------------- shifted fragment
and the -1 4 pair would produce the following:
         ----> read 2
                      read 1 <----

         ------------------------ fragment

             --------------------- shifted fragment
As can be seen, such fragments are considered to be on the - strand, so negative values shift to the left in the fragment's own frame of reference (and thus to the right relative to the + strand).
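The two-integer case described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's actual implementation: the function name shift_fragment, the half-open [start, end) coordinates, the clamping at position 0, and the example coordinates are all assumptions made for this sketch. The strand-specific four-integer case is deliberately left out.

```python
from typing import Tuple


def shift_fragment(start: int, end: int, shift: Tuple[int, int]) -> Tuple[int, int]:
    """Apply a two-value shift to a fragment given as a half-open interval [start, end).

    shift[0] moves the left-most end and shift[1] the right-most end.
    Positive values move an end to the right, negative values to the left.
    """
    left_shift, right_shift = shift
    # Assumption for this sketch: never move past the start of the chromosome.
    new_start = max(0, start + left_shift)
    new_end = end + right_shift
    if new_end <= new_start:
        # Assumption for this sketch: reject shifts that leave nothing behind.
        raise ValueError("shift leaves an empty fragment")
    return new_start, new_end


# A 24-base fragment at an arbitrary, made-up position, shifted with '-5 3'.
print(shift_fragment(100, 124, (-5, 3)))  # (95, 127)
```

Because the function only looks at the fragment's left-most and right-most coordinates, swapping read 1 and read 2 gives the same result, as stated above for the two-integer form.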
For more information on the tools, please visit our help site.
For support or questions, please post to Biostars. For bug reports and feature requests, please open an issue on GitHub.
This tool is developed by the Bioinformatics and Deep-Sequencing Unit at the Max Planck Institute for Immunobiology and Epigenetics.