BamHash (version 1.1)
Do not use read quality as part of checksum. (--no-quality)
Treat all reads as single end. (--no-paired)
Do not use read names as part of checksum. (--no-readnames)

WHAT IT DOES

Hash BAM and FASTQ files to verify data integrity

For each read in a BAM or FASTQ file, BamHash computes a hash value from the read name, whether the read is first or last in its pair, the sequence, and the quality values. The hash values are summed, so the result is independent of the order of the reads within the files. The results can be compared to verify that a pair of FASTQ files contains the same read information as the aligned BAM file.
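
The order-independence described above can be illustrated with a minimal Python sketch. It is not the BamHash implementation: the use of MD5, the truncation to 64 bits, the tab-separated field layout, and the names read_hash and checksum are all assumptions made for illustration. The use_names and use_quality parameters only mimic the effect of --no-readnames and --no-quality.

```python
import hashlib

MASK64 = (1 << 64) - 1  # keep the running sum within 64 bits (assumed width)


def read_hash(name, mate, seq, qual, use_names=True, use_quality=True):
    """Hash one read; `mate` is 1 for first in pair, 2 for last in pair."""
    parts = []
    if use_names:
        parts.append(name)
    parts.append(str(mate))
    parts.append(seq)
    if use_quality:
        parts.append(qual)
    # Assumption: MD5 over tab-joined fields, keeping the first 8 bytes.
    digest = hashlib.md5("\t".join(parts).encode("ascii")).digest()
    return int.from_bytes(digest[:8], "little")


def checksum(reads, **opts):
    """Sum the per-read hashes modulo 2**64.

    Addition is commutative, so the total does not depend on read order.
    """
    total = 0
    for name, mate, seq, qual in reads:
        total = (total + read_hash(name, mate, seq, qual, **opts)) & MASK64
    return total
```

Because the per-read hashes are combined by addition, calling checksum on the same reads in any order gives the same value, which is what allows a coordinate-sorted BAM file to be compared with the original FASTQ files.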


BAM

Processes a number of BAM files. BAM files are assumed to contain paired-end reads. When run with --no-paired, all reads are treated as single end, and a warning is displayed if any read is marked as "second in pair" in the BAM file.


FASTA

Processes a number of FASTA files. All FASTA files are assumed to contain single-end reads with no quality information. To compare the result to a BAM file, run bamhash_checksum_bam --no-paired --no-quality.


FASTQ

Processes a number of FASTQ files. FASTQ files are assumed to contain paired-end reads and are taken two at a time: the first two files hold the two mates of the first set of read pairs, the next two files hold the next set, and so on. If the read names in the two files of a pair do not match, the program exits with failure.
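
The read-name check between two paired FASTQ files can be sketched as follows. This is an illustrative Python sketch, not BamHash code: the helper names are hypothetical, records are assumed to span exactly four lines, and stripping a trailing /1 or /2 before comparing names is an assumption about how mates are labelled.

```python
from itertools import zip_longest


def fastq_records(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        # Drop the leading '@' and any description after the first space.
        name = header.strip()[1:].split()[0]
        yield name, seq, qual


def base_name(name):
    """Assumption: mates may be labelled with a trailing /1 or /2."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2]
    return name


def paired_records(handle1, handle2):
    """Walk two FASTQ streams in lockstep and fail on the first mismatch."""
    for r1, r2 in zip_longest(fastq_records(handle1), fastq_records(handle2)):
        if r1 is None or r2 is None:
            raise ValueError("FASTQ files have different numbers of reads")
        if base_name(r1[0]) != base_name(r2[0]):
            raise ValueError(f"read name mismatch: {r1[0]} vs {r2[0]}")
        yield r1, r2
```

Iterating over paired_records with two open file handles yields the mates together, so each could be hashed as first or last in pair; a ValueError here plays the role of the program exiting with failure.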


BamHash is free and open source software; see the BamHash GitHub website for more details.