Galaxy | Tool Preview

FastQC (version 0.74+galaxy0)
A tab-delimited file with 2 columns: name and sequence. For example: Illumina Small RNA RT Primer CAAGCAGAAGACGGCATACGA (the name, a tab, then the sequence).
List of adapter sequences which will be explicitly searched against the library. It should be a tab-delimited file with 2 columns: name and sequence.
A file that specifies which submodules are to be executed (default=all) and also specifies the thresholds for each submodule's warning parameter.
Using this option will cause fastqc to crash and burn if you use it on really long reads, and your plots may end up a ridiculous size. You have been warned!
As long as you set this to a value greater than or equal to your longest read length, this will be the sequence length used to create your read groups. This can be useful for making directly comparable statistics from datasets with somewhat variable read lengths.
Note: the Kmer test is disabled and needs to be enabled using a custom Submodule and limits file
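
The note above about enabling the Kmer test can be sketched as follows. This is a minimal illustration, assuming the three-column layout (module, instruction, value) used by the limits.txt file shipped with FastQC, where `ignore 0` means the module is run and `ignore 1` means it is skipped. The helper `render_limits` and the threshold values are hypothetical placeholders, not recommended settings. Whether a partial file like this is sufficient is not stated here, so starting from the complete limits.txt bundled with FastQC is the safer route.

```python
def render_limits(ignore, thresholds):
    """Render limits-file text from plain dictionaries (hypothetical helper).

    ignore:     {module_name: bool}, True means the module is skipped
    thresholds: {module_name: {"warn": value, "error": value}}
    """
    lines = []
    for module, skip in ignore.items():
        # Assumed format: "<module> ignore <0|1>", 1 = do not run the module
        lines.append(f"{module}\tignore\t{1 if skip else 0}")
    for module, levels in thresholds.items():
        for level in ("warn", "error"):
            if level in levels:
                lines.append(f"{module}\t{level}\t{levels[level]}")
    return "\n".join(lines) + "\n"


# Enable the Kmer module; the warn/error numbers are illustrative only.
limits_text = render_limits(
    ignore={"kmer": False},
    thresholds={"kmer": {"warn": 2, "error": 5}},
)
print(limits_text, end="")
```

The resulting text would be saved to a file and supplied as the custom Submodule and limits file.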

Purpose

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a set of analyses which you can use to get a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are:


FastQC

This is a Galaxy wrapper. It merely exposes the external package FastQC, which is documented at FastQC. Kindly acknowledge it as well as this tool if you use it. FastQC incorporates the Picard-tools libraries for SAM/BAM processing.

The contaminants file parameter was borrowed from the independently developed fastqcwrapper contributed to the Galaxy Community Tool Shed by J. Johnson. Adaptation to version 0.11.2 by T. McGowan.


Inputs and outputs

FastQC is the best place to look for documentation - it's very good. A summary follows below for those in a tearing hurry.

This wrapper will accept a Galaxy fastq, fastq.gz, sam or bam file as the input read file to check. It will also take an optional file containing a list of contaminants, in the form of a tab-delimited file with 2 columns: name and sequence. As another option, the tool takes a custom limits.txt file that sets the warning thresholds for the different modules and also specifies which modules to include in the output.
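
To make the contaminants file format concrete, here is a minimal sketch that checks such a file before uploading it. It assumes exactly one tab between name and sequence, and that blank lines and lines starting with `#` can be skipped; both are assumptions, not documented behaviour of this wrapper. The function name `parse_contaminants` is hypothetical.

```python
import io

# The one example entry given in the tool help: name, a tab, then the sequence.
EXAMPLE = "Illumina Small RNA RT Primer\tCAAGCAGAAGACGGCATACGA\n"


def parse_contaminants(handle):
    """Return (name, sequence) tuples from a 2-column tab-delimited file."""
    entries = []
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {lineno}: expected 2 tab-separated columns, got {len(fields)}"
            )
        name, sequence = fields[0].strip(), fields[1].strip().upper()
        if not name or not sequence:
            raise ValueError(f"line {lineno}: empty name or sequence")
        entries.append((name, sequence))
    return entries


entries = parse_contaminants(io.StringIO(EXAMPLE))
print(entries)
```

A line separated by spaces instead of a tab raises `ValueError`, which is the most common way such a file goes wrong.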

The tool produces a basic text file and an HTML output file that contain all of the results, including the following:

All except Basic Statistics and Overrepresented sequences are plots.