A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report generated on 2016-05-23, 10:05 based on data in:
/root/galaxy/database/jobs_directory/000/107/working/multiqc_WDir
General Statistics
Showing 13 rows.

Sample Name | % Assigned | M Assigned | % Dups | Insert Size | Fold Enrichment | Target Bases 30X | CCG Oxidation | M Total seqs | M Reads Mapped | M Non-Primary Alignments | Error rate | % mCpG | % mCHG | % mCHH | M C's | % Dups | M Unique | M Aligned | % Aligned | % Aligned | M Aligned | % Trimmed | % Dups | % GC | Length | M Seqs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
70__TopHat_on_data_1,_data_4,_and_data_3__accepted_hits | 70.8% | 0.3 | ||||||||||||||||||||||||
75__TopHat_on_data_1,_data_6,_and_data_5__accepted_hits | 69.6% | 0.4 | ||||||||||||||||||||||||
80__TopHat_on_data_1,_data_8,_and_data_7__accepted_hits | 71.8% | 0.4 | ||||||||||||||||||||||||
85__TopHat_on_data_1,_data_10,_and_data_9__accepted_hits | 72.0% | 0.4 | ||||||||||||||||||||||||
90__TopHat_on_data_1,_data_12,_and_data_11__accepted_hits | 71.3% | 0.4 | ||||||||||||||||||||||||
95__TopHat_on_data_1,_data_14,_and_data_13__accepted_hits | 70.7% | 0.5 | ||||||||||||||||||||||||
Bismark Report_SE_report | 0.2 | 69.7% | ||||||||||||||||||||||||
Stats on data 1 and data 95 | 0.6 | 0.6 | 0.0 | 0.42% | ||||||||||||||||||||||
TopHat on data 1, data 14, and data 13: align_summary | 99.5% | 0.3 | ||||||||||||||||||||||||
dataset_114.dat] | 0.6% | |||||||||||||||||||||||||
dataset_197.dat | 176 | |||||||||||||||||||||||||
poulet5_1 | 36.3% | 48% | 101 | 0.3 | ||||||||||||||||||||||
poulet5_2 | 36.2% | 48% | 101 | 0.3 |
featureCounts
Subread featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations.
Picard
Picard is a set of Java command line tools for manipulating high-throughput sequencing data.
Mark Duplicates
Insert Size
Plot shows the number of reads at a given insert size. Reads with different orientations are summed.
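The summing of orientations described above can be sketched as follows. This is a minimal illustration, assuming per-orientation histograms are available as dictionaries mapping insert size to read count. The function name `sum_orientations`, the data layout and the sample counts are hypothetical and are not taken from Picard, MultiQC or this report.

```python
from collections import Counter

def sum_orientations(histograms):
    """Combine per-orientation insert size histograms into one.

    `histograms` maps an orientation label (e.g. "FR", "RF") to a dict of
    {insert_size: read_count}. This layout is an assumption for illustration.
    """
    total = Counter()
    for counts in histograms.values():
        total.update(counts)  # adds the counts for matching insert sizes
    return dict(total)

# Hypothetical counts, not taken from this report
example = {
    "FR": {150: 10, 176: 25},
    "RF": {176: 5, 200: 2},
}
combined = sum_orientations(example)
```

Each insert size then carries one value per sample, which is what a single line in the plot needs.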
GC Coverage Bias
This plot shows bias in coverage across regions of the genome with varying GC content. A perfect library would be a flat line at y = 1.
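To see why an unbiased library gives a flat line at y = 1, here is a rough sketch of a normalized coverage calculation. It assumes the y-axis is the coverage in each GC bin divided by the mean coverage over all bins; the function name, the input layout (reads and genomic windows per GC percentage) and the numbers are hypothetical.

```python
def normalized_coverage(reads_per_bin, windows_per_bin):
    """Coverage of each GC bin relative to the overall mean coverage.

    Both arguments map a GC percentage to a count. Bins with no windows
    are skipped because their coverage is undefined.
    """
    total_reads = sum(reads_per_bin.values())
    total_windows = sum(windows_per_bin.values())
    mean_coverage = total_reads / total_windows
    result = {}
    for gc, windows in windows_per_bin.items():
        if windows > 0:
            bin_coverage = reads_per_bin.get(gc, 0) / windows
            result[gc] = bin_coverage / mean_coverage
    return result

# Hypothetical unbiased library: reads are proportional to windows in every bin
flat = normalized_coverage({30: 100, 50: 200, 70: 100},
                           {30: 100, 50: 200, 70: 100})

# Hypothetical biased library: the low-GC bin is under-covered
biased = normalized_coverage({30: 50, 50: 100}, {30: 100, 50: 100})
```

In the first case every bin evaluates to 1; in the second, the 30% GC bin falls below 1 and the 50% GC bin rises above it.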
Samtools
Samtools is a suite of programs for interacting with high-throughput sequencing data. This module parses the output from samtools stats.
Reads Mapping
Bases Mapping
Read Pairs
Bismark
Bismark is a tool to map bisulfite converted sequence reads and determine cytosine methylation states.
Alignment Rates
Strand Alignment
Tophat
Tophat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes.
Cutadapt
Cutadapt is a tool to find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
This plot shows the number of reads with certain lengths of adapter trimmed. Obs/Exp shows the raw counts divided by the number expected due to sequencing errors. A defined peak may be related to adapter length. See the cutadapt documentation for more information on how these numbers are generated.
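The Obs/Exp ratio described above amounts to a per-length division. The sketch below assumes observed and expected counts are available as dictionaries keyed by trimmed length; the function name and the numbers are hypothetical, and returning `None` when the expected count is zero is an assumption of this sketch rather than documented behaviour of cutadapt or MultiQC.

```python
def obs_over_exp(observed, expected):
    """Divide observed trimmed-read counts by expected counts, per length."""
    ratios = {}
    for length, obs in observed.items():
        exp = expected.get(length, 0)
        # Assumption: a zero expected count yields None instead of a ratio
        ratios[length] = obs / exp if exp > 0 else None
    return ratios

# Hypothetical counts, not taken from this report
observed = {3: 120, 4: 40, 20: 300}
expected = {3: 100.0, 4: 25.0, 20: 0.0}
ratios = obs_over_exp(observed, expected)
```

A ratio well above 1 at a particular length means more reads were trimmed there than sequencing errors alone would explain.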
FastQC
FastQC is a quality control tool for high throughput sequence data, written by Simon Andrews at the Babraham Institute in Cambridge.
Sequence Quality Histograms
The mean quality value across each base position in the read. See the FastQC help.
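As a rough illustration of the quantity plotted, the sketch below averages Phred scores at each read position. It assumes Phred+33 encoded quality strings and is not FastQC's own implementation; the function name and the example strings are hypothetical.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Mean Phred score at each position across reads of any length."""
    sums, counts = [], []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(sums):  # first read that reaches this position
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset  # decode Phred+33
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical reads: "I" decodes to Q40 and "5" to Q20
means = mean_quality_per_position(["II5", "I5"])
```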
Per Sequence Quality Scores
The number of reads with average quality scores. Shows if a subset of reads has poor quality. See the FastQC help.
Per Base Sequence Content
The proportion of each base position for which each of the four normal DNA bases has been called. See the FastQC help.
Per Sequence GC Content
The average GC content of reads. Normal random libraries typically have a roughly normal distribution of GC content. See the FastQC help.
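The distribution mentioned above can be built by computing a GC percentage per read and counting reads at each rounded value. This is a minimal sketch, not FastQC's implementation; the function names and the example reads are hypothetical.

```python
from collections import Counter

def gc_percent(seq):
    """Percentage of G and C bases in one read (0.0 for an empty read)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def gc_histogram(reads):
    """Number of reads at each GC percentage, rounded to a whole number."""
    return Counter(round(gc_percent(read)) for read in reads)

# Hypothetical reads, not taken from this report
hist = gc_histogram(["GGCC", "ATAT", "ATGC", "atgc"])
```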
Per Base N Content
The percentage of base calls at each position for which an N was called. See the FastQC help.
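The per-position N percentage can be sketched in the same spirit. The code assumes reads are plain strings of base calls and may differ in length; the function name and the example reads are hypothetical, and this is not FastQC's implementation.

```python
def n_percent_per_position(reads):
    """Percentage of N calls at each position across reads of any length."""
    n_counts, totals = [], []
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(totals):  # first read that reaches this position
                n_counts.append(0)
                totals.append(0)
            totals[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [100.0 * n / t for n, t in zip(n_counts, totals)]

# Hypothetical reads, not taken from this report
n_pct = n_percent_per_position(["ACGN", "NCGT", "AC"])
```

Here the last position is covered by two reads, one of which is an N call, so it evaluates to 50%.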
Sequence Length Distribution
All samples have sequences of exactly 101 bp in length.
Sequence Duplication Levels
The relative level of duplication found for every sequence. See the FastQC help.
Adapter Content
The cumulative percentage count of the proportion of your library which has seen each of the adapter sequences at each position. See the FastQC help. Only samples with ≥ 0.1% adapter contamination are shown.