
Build base quality distribution (version 1.0.2)

To use this tool, your dataset needs to be in the Quality Score format. Click the pencil icon next to your dataset to set the datatype to Quality Score (see below for examples).


What it does

This tool takes Quality Files generated by Roche (454), Illumina (Solexa), or ABI SOLiD machines and builds a graph showing the score distribution, like the one below. Such a graph allows you to perform an initial evaluation of data quality in a single pass.


Examples of Quality Data

Roche (454) or ABI SOLiD data:

>seq1
23 33 34 25 28 28 28 32 23 34 27 4 28 28 31 21 28

Illumina (Solexa) data:

-40 -40 40 -40   -40 -40 -40 40
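
To make the two layouts above concrete, here is a minimal Python sketch that reads each of them. It is not the tool's own code. The function names are hypothetical, and the Illumina (Solexa) handling rests on two assumptions that this page does not state: that each group of four scores belongs to one read position (one score per nucleotide), and that the highest score in a group is the one of interest.

```python
def parse_fasta_like_scores(text):
    """Parse '>name' header lines followed by whitespace-separated integer scores."""
    reads = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            reads[name] = []
        elif name is not None:
            reads[name].extend(int(tok) for tok in line.split())
    return reads


def parse_four_score_line(line, group_size=4):
    """Split one line of scores into per-position groups.

    Assumption: every position carries `group_size` scores (one per nucleotide).
    """
    values = [int(tok) for tok in line.split()]
    if len(values) % group_size != 0:
        raise ValueError("number of scores is not a multiple of the group size")
    return [values[i:i + group_size] for i in range(0, len(values), group_size)]


roche = parse_fasta_like_scores(
    ">seq1\n23 33 34 25 28 28 28 32 23 34 27 4 28 28 31 21 28\n"
)
illumina = parse_four_score_line("-40 -40 40 -40   -40 -40 -40 40")

# Assumption: keep the highest of the four scores at each position.
best_per_position = [max(group) for group in illumina]
```

With the example data above, `roche["seq1"]` holds 17 scores and `illumina` holds two groups of four.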

Output example

Quality scores are summarized as a boxplot (Roche 454 FLX data):

/repository/static/images/a3beeb5448ce6142/short_reads_boxplot.png

where the X-axis is the coordinate along the read and the Y-axis is the quality score, adjusted to comply with the Phred score metric. Units on the X-axis depend on whether your data come from Roche (454) machines or from Illumina (Solexa) and ABI SOLiD machines:

  • For Roche (454), the X-axis (shown above) indicates relative position (in %) within reads, as this technology produces reads of different lengths;
  • For Illumina (Solexa) and ABI SOLiD, the X-axis shows absolute position in nucleotides within reads.
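
The difference between the two X-axis conventions can be sketched in Python as follows. This only illustrates the idea of relative versus absolute coordinates and is not the tool's implementation. The function names, the `n_bins` parameter, and the integer-division binning rule are all assumptions made for the example.

```python
from collections import defaultdict


def bin_by_relative_position(reads, n_bins=100):
    """Group scores by relative position so reads of different lengths line up.

    With n_bins=100, each bin corresponds to one percent of the read length.
    """
    bins = defaultdict(list)
    for scores in reads:
        length = len(scores)
        for i, score in enumerate(scores):
            # Map index i in [0, length) to a bin in [0, n_bins).
            bins[i * n_bins // length].append(score)
    return dict(bins)


def bin_by_absolute_position(reads):
    """Group scores by 1-based nucleotide position within the read."""
    bins = defaultdict(list)
    for scores in reads:
        for position, score in enumerate(scores, start=1):
            bins[position].append(score)
    return dict(bins)


# Two hypothetical reads of different lengths.
reads = [[30, 20], [30, 28, 25, 10]]
relative = bin_by_relative_position(reads, n_bins=4)
absolute = bin_by_absolute_position(reads)
```

In the relative version, the last score of the short read lands in the same bin as the third score of the long read. In the absolute version, it is paired with the second.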

Every box on the plot shows the following values:

   o     <---- Outliers
   o
  -+-    <---- Upper Extreme Value that is no more
   |           than box length away from the box
   |
+--+--+  <---- Upper Quartile
|     |
+-----+  <---- Median
|     |
+--+--+  <---- Lower Quartile
   |
   |
  -+-    <---- Lower Extreme Value that is no more
               than box length away from the box
   o     <---- Outlier
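
The values labelled in the diagram can be computed for one X-axis position roughly as in the Python sketch below. It reads "no more than box length away from the box" as a whisker limit of one box length (upper quartile minus lower quartile) beyond each quartile. The quartile interpolation method and the function name `box_stats` are assumptions, so the tool's own numbers may differ slightly.

```python
import statistics


def box_stats(scores, whisker=1.0):
    """Return the boxplot values for a list of at least two quality scores."""
    data = sorted(scores)
    # Assumption: "inclusive" quartiles; other methods give slightly different cuts.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1
    low_fence = q1 - whisker * box_length
    high_fence = q3 + whisker * box_length
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "lower_extreme": inside[0],    # lowest value within one box length of the box
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "upper_extreme": inside[-1],   # highest value within one box length of the box
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Scores of seq1 from the Roche (454) example above.
seq1 = [23, 33, 34, 25, 28, 28, 28, 32, 23, 34, 27, 4, 28, 28, 31, 21, 28]
stats = box_stats(seq1)
```

For these scores the box spans 25 to 31 with a median of 28, the extremes are 21 and 34, and the single score of 4 is reported as an outlier.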