A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report generated on 2017-05-24, 21:48 based on data in:
/tmp/tmpWNBij6/job_working_directory/000/46/working/multiqc_WDir
General Statistics
Showing 15/15 rows and 17/21 columns.
Sample Name | % Assigned | M Assigned | % Dups | Insert Size | Error rate | M Non-Primary | M Reads Mapped | % Mapped | M Total seqs | M Reads Mapped | % Aligned | % Aligned | M Aligned | % Trimmed | % Dups | % GC | M Seqs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
70: TopHat on data 1, data 4, and data 3: accepted_hits | 70.8% | 0.3 | |||||||||||||||
75: TopHat on data 1, data 6, and data 5: accepted_hits | 69.6% | 0.4 | |||||||||||||||
80: TopHat on data 1, data 8, and data 7: accepted_hits | 71.8% | 0.4 | |||||||||||||||
85: TopHat on data 1, data 10, and data 9: accepted_hits | 72.0% | 0.4 | |||||||||||||||
90: TopHat on data 1, data 12, and data 11: accepted_hits | 71.3% | 0.4 | |||||||||||||||
95: TopHat on data 1, data 14, and data 13: accepted_hits | 70.7% | 0.5 | |||||||||||||||
bismark_data | 69.7% | ||||||||||||||||
dataset_114 | 0.6% | ||||||||||||||||
dataset_197 | 176bp | ||||||||||||||||
dataset_33 | 10.8% | ||||||||||||||||
poulet5_1 | 36.3% | 48% | 0.3 | ||||||||||||||
poulet5_2 | 36.2% | 48% | 0.3 | ||||||||||||||
samtools_flagstat | 20.7 | ||||||||||||||||
samtools_stats | 0.42% | 0.0 | 0.6 | 100.0% | 0.6 | ||||||||||||
tophat_data | 99.5% | 0.3 |
featureCounts
Subread featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations.
Picard
Picard is a set of Java command line tools for manipulating high-throughput sequencing data.
Mark Duplicates
Insert Size
Plot shows the number of reads at a given insert size. Reads with different orientations are summed.
GC Coverage Bias
This plot shows bias in coverage across regions of the genome with varying GC content. A perfect library would be a flat line at y = 1.
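As a rough illustration of why an unbiased library sits at y = 1, the sketch below divides the read density in each GC bin by the mean read density over all windows. The function name and its two input lists are hypothetical, and this is not Picard's own implementation, only a minimal sketch of a normalized-coverage calculation under that assumption.

```python
def normalized_coverage(windows_per_bin, reads_per_bin):
    """Return per-GC-bin coverage relative to the mean over all windows.

    windows_per_bin[i]: number of reference windows whose GC content falls in bin i
    reads_per_bin[i]:   number of reads starting in those windows
    (Both inputs are illustrative; they are not read from a Picard file here.)
    """
    total_windows = sum(windows_per_bin)
    total_reads = sum(reads_per_bin)
    if total_windows == 0 or total_reads == 0:
        raise ValueError("need at least one window and one read")

    mean_reads_per_window = total_reads / total_windows
    result = []
    for windows, reads in zip(windows_per_bin, reads_per_bin):
        if windows == 0:
            # No reference windows in this bin, so coverage is undefined.
            result.append(None)
        else:
            result.append((reads / windows) / mean_reads_per_window)
    return result
```

A value above 1 in a bin means that bin is covered more heavily than average, and a value below 1 means it is under-covered.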
Samtools
Samtools is a suite of programs for interacting with high-throughput sequencing data.
Percent Mapped
Alignment metrics from samtools stats; mapped vs. unmapped reads.
Alignment metrics
This module parses the output from samtools stats. All numbers in millions.
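For readers who want to check these figures by hand, the following sketch reads the tab-separated `SN` (summary numbers) lines that samtools stats writes and derives a percent-mapped value from them. The helper names are hypothetical and this is not the report module's own code; it assumes the `raw total sequences` and `reads mapped` keys are present in the input.

```python
def parse_summary_numbers(text):
    """Collect the 'SN' (summary numbers) lines of samtools stats output."""
    summary = {}
    for line in text.splitlines():
        fields = line.split("\t")
        # SN lines look like: SN <tab> key: <tab> value [<tab> # comment]
        if len(fields) < 3 or fields[0] != "SN":
            continue
        key = fields[1].rstrip(":")
        try:
            summary[key] = float(fields[2])
        except ValueError:
            continue  # skip values that are not numeric
    return summary


def percent_mapped(summary):
    """Mapped reads as a percentage of raw total sequences."""
    total = summary["raw total sequences"]
    mapped = summary["reads mapped"]
    return 100.0 * mapped / total if total else 0.0
```

Dividing a parsed count by 1,000,000 gives the "millions" scale used in the plot.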
Samtools Flagstat
This module parses the output from samtools flagstat. All numbers in millions.
Bismark
Bismark is a tool to map bisulfite converted sequence reads and determine cytosine methylation states.
Alignment Rates
Strand Alignment
TopHat
TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes.
Cutadapt
Cutadapt is a tool to find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
This plot shows the number of reads with certain lengths of adapter trimmed. Obs/Exp shows the raw counts divided by the number expected due to sequencing errors. A defined peak may be related to adapter length. See the cutadapt documentation for more information on how these numbers are generated.
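The Obs/Exp ratio described above is a plain division per trimmed length. The sketch below takes the observed and expected counts as given inputs (both dictionaries are hypothetical) and does not reproduce how cutadapt itself estimates the expected counts.

```python
def obs_over_exp(observed, expected):
    """Divide observed trim counts by expected counts, keyed by trimmed length.

    observed: {trimmed_length: raw count of reads trimmed by that length}
    expected: {trimmed_length: count expected from sequencing errors alone}
    Lengths with no positive expected count map to None, since the ratio
    is undefined there.
    """
    ratios = {}
    for length, obs in observed.items():
        exp = expected.get(length, 0.0)
        ratios[length] = obs / exp if exp > 0 else None
    return ratios
```

A ratio well above 1 at one length marks a peak that random errors alone would not explain.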
FastQC
FastQC is a quality control tool for high throughput sequence data, written by Simon Andrews at the Babraham Institute in Cambridge.
Sequence Quality Histograms
The mean quality value across each base position in the read. See the FastQC help.
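To make the metric concrete, here is a minimal sketch of a per-position mean quality calculation over FASTQ quality strings. It assumes Phred+33 encoding, and the function name is hypothetical; it is not FastQC's own code.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Mean Phred score at each read position.

    quality_strings: iterable of FASTQ quality lines (assumed Phred+33).
    Reads may differ in length; each position is averaged over the reads
    that are long enough to reach it.
    """
    sums = []
    counts = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(sums):
                # First read to reach this position: start new accumulators.
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]
```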
Per Sequence Quality Scores
The number of reads at each average quality score. Shows if a subset of reads has poor quality. See the FastQC help.
Per Base Sequence Content
The proportion of each base position for which each of the four normal DNA bases has been called. See the FastQC help.
Per Sequence GC Content
The average GC content of reads. A normal random library typically has a roughly normal distribution of GC content. See the FastQC help.
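The distribution described above can be approximated by computing the GC percentage of each read and counting reads per whole-percent bin. The sketch below uses hypothetical function names and a simple binning choice; FastQC's own binning and its fitted theoretical curve are not reproduced.

```python
from collections import Counter


def gc_percent(seq):
    """GC percentage of one read, ignoring bases that are not A, C, G or T."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None  # nothing but N or other symbols
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(reads):
    """Count reads per GC percentage, rounded to a whole percent."""
    hist = Counter()
    for read in reads:
        gc = gc_percent(read)
        if gc is not None:
            hist[round(gc)] += 1
    return hist
```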
Per Base N Content
The percentage of base calls at each position for which an N was called. See the FastQC help.
Sequence Length Distribution
Sequence Duplication Levels
The relative level of duplication found for every sequence. See the FastQC help.
Overrepresented sequences
The total amount of overrepresented sequences found in each library. See the FastQC help for further information.
Adapter Content
The cumulative percentage count of the proportion of your library which has seen each of the adapter sequences at each position. See the FastQC help. Only samples with ≥ 0.1% adapter contamination are shown.