What it does
FastQC is a product of the Bioinformatics Group at the Babraham Institute. It aims to provide a simple way to run quality control checks on raw sequence data coming from high-throughput sequencing pipelines. It provides a modular set of analyses that give a quick impression of whether your data has any problems you should be aware of before doing any further analysis.
The main functions of FastQC are:
- Import of data from BAM, SAM or FastQ files (any variant)
- Providing a quick overview to tell you in which areas there may be problems
- Summary graphs and tables to quickly assess your data
- Export of results to an HTML-based permanent report
- Offline operation to allow automated generation of reports without running the interactive application
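The offline mode mentioned in the last item can be scripted. The following is a minimal sketch, assuming the `fastqc` executable is installed and on the `PATH`; the helper names `build_fastqc_command` and `run_fastqc` and the example paths are hypothetical, while `--outdir` and `--contaminants` are options of the FastQC command line.

```python
import subprocess
from pathlib import Path


def build_fastqc_command(reads, outdir, contaminants=None):
    """Return the argument list for a non-interactive FastQC run."""
    cmd = ["fastqc", "--outdir", str(outdir)]
    if contaminants is not None:
        # Optional custom contaminant list (one name[tab]sequence per line).
        cmd += ["--contaminants", str(contaminants)]
    cmd.append(str(reads))
    return cmd


def run_fastqc(reads, outdir, contaminants=None):
    """Create the output directory, then run FastQC and raise on failure.

    Assumes the `fastqc` executable can be found on the PATH.
    """
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(reads, outdir, contaminants), check=True)
```

Calling `run_fastqc("reads.fastq", "qc_out")` would write the report into `qc_out`. Keeping the command construction separate from the execution makes the arguments easy to inspect or log before anything is run.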
Input format
Any FastQ file, for example:
    @HWI-EAS91_1_30788AAXX:7:21:1542:1758
    GTCAATTGTACTGGTCAATACTAAAAGAATAGGATCGCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
    +HWI-EAS91_1_30788AAXX:7:21:1542:1758
    hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR
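A FastQ record like the one above consists of four lines: a header starting with `@`, the sequence, a separator starting with `+`, and a quality string with one character per base. The sketch below reads such records from any iterable of lines. It is an illustration only, not how FastQC itself parses input: the function name `read_fastq` is hypothetical, and it assumes that sequences are not wrapped over several lines.

```python
def read_fastq(lines):
    """Yield (name, sequence, quality) tuples from four-line FastQ records.

    Assumes each record occupies exactly four lines (no wrapped sequences).
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    while True:
        header = next(stripped, None)
        if header is None:
            return  # end of input
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(stripped, None)
        separator = next(stripped, None)
        quality = next(stripped, None)
        if quality is None:
            raise ValueError(f"truncated record: {header!r}")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        yield header[1:], sequence, quality
```

With a file on disk, `with open("reads.fastq") as handle: records = list(read_fastq(handle))` would collect all records; iterating lazily instead keeps memory use low for large files.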
Contaminants format
An optional contaminant file (otherwise FastQC will use the default):
    # This file contains a list of potential contaminants which are
    # frequently found in high throughput sequencing reactions. These
    # are mostly sequences of adapters / primers used in the various
    # sequencing chemistries.
    #
    # You can add more sequences to the file by putting one line per entry
    # and specifying a name[tab]sequence. If the contaminant you add is
    # likely to be of use to others please consider sending it to the FastQ
    # authors, either via a bug report at www.bioinformatics.bbsrc.ac.uk/bugzilla/
    # or by directly emailing simon.andrews@bbsrc.ac.uk so other users of
    # the program can benefit.

    Illumina Single End Adapter 1	ACACTCTTTCCCTACACGACGCTGTTCCATCT
    Illumina Single End Adapter 2	CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
    Illumina Single End PCR Primer 1	AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
    Illumina Single End PCR Primer 2	CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
    Illumina Single End Sequencing Primer	ACACTCTTTCCCTACACGACGCTCTTCCGATCT
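Because contaminant names contain spaces, the tab is the only reliable separator between name and sequence. The sketch below checks a custom contaminant file before handing it to FastQC. The function name `parse_contaminants` is hypothetical, and accepting a run of several tabs as one separator is an assumption made here for convenience, not documented FastQC behavior.

```python
import re


def parse_contaminants(lines):
    """Return a {name: sequence} dict from name[tab]sequence lines.

    Blank lines and lines starting with '#' are skipped.
    """
    contaminants = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of tabs only; names may contain spaces.
        parts = re.split(r"\t+", line, maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"expected name[tab]sequence, got: {line!r}")
        name, sequence = parts
        contaminants[name.strip()] = sequence.strip().upper()
    return contaminants
```

A line whose name and sequence are separated by spaces instead of a tab raises a `ValueError`, which catches the most common mistake when editing the file by hand.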
Outputs
An HTML file with links to:
- fastqc_report.html
- summary.txt
- fastqc_data.txt
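For automated pipelines, `summary.txt` is the easiest of these files to check programmatically. The sketch below assumes that it holds one tab-separated line per analysis module in the order status, module name, input file name, with statuses such as `PASS`, `WARN` and `FAIL`. That layout is not described in the text above, so verify it against a real report; the function names are hypothetical.

```python
def parse_summary(lines):
    """Return a {module name: status} dict from summary.txt lines.

    Assumes tab-separated columns: status, module name, input file name.
    """
    results = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        status, module, _filename = line.split("\t", 2)
        results[module] = status
    return results


def failed_modules(results):
    """Return the names of modules whose status is FAIL, in file order."""
    return [module for module, status in results.items() if status == "FAIL"]
```

A pipeline step could then stop early when `failed_modules(parse_summary(open("summary.txt")))` is not empty, instead of requiring someone to open the HTML report.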