What it does
BSMAP is a short-read mapping program for bisulfite sequencing reads. It has the following features:
- read length up to 144 nt, with up to 15 mismatches and gap sizes up to 3 bp allowed.
- support for single-end and paired-end mapping, and for multi-threaded mapping.
- support for both the "Lister protocol" (sequences the 2 forward strands only) and the "Cokus protocol" (sequences all 4 bisulfite-converted strands).
- reads are mapped directly to the original reference genome sequence, so there is no need to preprocess the reads or the reference genome to convert C to T.
- support for both whole genome bisulfite sequencing (WGBS) mode and reduced representation bisulfite sequencing (RRBS) mode; the digestion site information can be changed to support different digestion enzymes for RRBS.
- trimming of adapter sequences and low-quality nucleotides from the 3' end of reads.
- adjustable trade-off between speed, memory usage and mapping sensitivity. For the human genome, RRBS mode uses ~3 GB of memory. In WGBS mode, typical memory usage is ~9 GB, but it can be as low as 5 GB.
- alignment for other nucleotide transitions, for example, it can be set to detect the A=>I(G) transition in RNA editing.
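
The idea behind mapping reads directly to the unconverted reference, and behind supporting other transitions, can be illustrated with an asymmetric mismatch count. The sketch below is only a conceptual illustration, not BSMAP's actual algorithm: the function name `bisulfite_mismatches` and its `transition` parameter are hypothetical, and it only compares a read against one reference window of the same length on the forward strand (reads from the reverse strand would use the complementary G to A transition).

```python
def bisulfite_mismatches(read, ref, transition=("C", "T")):
    """Count mismatches between a read and an equal-length reference window.

    `transition` is (reference base, read base). A read base that equals the
    converted form of the reference base is tolerated, but not the reverse.
    Hypothetical helper for illustration only.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    ref_base, read_base = transition
    mismatches = 0
    for r, g in zip(read.upper(), ref.upper()):
        if r == g:
            continue  # exact match
        if g == ref_base and r == read_base:
            continue  # converted base (e.g. reference C read as T)
        mismatches += 1
    return mismatches


# Reference C read as T is tolerated (unmethylated C after conversion).
print(bisulfite_mismatches("TTGA", "CTGA"))  # 0
# The reverse is not: a read C over a reference T is a real mismatch.
print(bisulfite_mismatches("CTGA", "TTGA"))  # 1
# Same logic for another transition, e.g. A read as G (A=>I(G) editing).
print(bisulfite_mismatches("AGGT", "AAGT", transition=("A", "G")))  # 0
```

Because the tolerance is one-directional, the reference can stay unconverted and the methylation state of each C is still recoverable from the aligned read.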
Input formats
BSMAP accepts files in FASTA/FASTQ format.
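
As a quick pre-flight check before mapping, the two formats can be told apart by the first non-empty line: FASTA records start with ">" and FASTQ records start with "@". The helper below is a minimal sketch under that assumption; `sniff_read_format` is a hypothetical name and is not part of BSMAP, and it only handles uncompressed text files.

```python
def sniff_read_format(path):
    """Guess whether a read file is FASTA or FASTQ from its first record line.

    Hypothetical helper for illustration; returns "fasta", "fastq" or "unknown".
    """
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            break  # first non-empty line is neither header type
    return "unknown"
```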
Outputs
The output contains the following files:
- mapped reads in SAM format
- mapping summary
- unpaired hits (only for paired-end mapping)
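
The SAM output can be inspected with a few lines of code. The sketch below counts alignment records and how many of them are mapped, relying only on the SAM convention that header lines start with "@" and that bit 0x4 of the FLAG field (second column) marks an unmapped record. The function name `count_mapped` is hypothetical, and the numbers it returns count records, so they are not guaranteed to match the figures in the mapping summary file.

```python
def count_mapped(sam_lines):
    """Return (mapped, total) alignment record counts from SAM text lines.

    Hypothetical helper for illustration; counts records, not unique reads.
    """
    total = mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if not flag & 0x4:  # bit 0x4 set means the record is unmapped
            mapped += 1
    return mapped, total
```

For a file on disk, the function can be called with an open file handle, for example `count_mapped(open("output.sam"))`, where `output.sam` is a placeholder file name.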