Galaxy | Tool Preview

edgeR: Differential Gene(Expression) Analysis (version 3.11.0.b)

Overview

Differential expression analysis of RNA-seq and digital gene expression profiles with biological replication. Uses empirical Bayes estimation and exact tests based on the negative binomial distribution. Also useful for differential signal analysis with other types of genome-scale count data [1].

Every experiment requires a design matrix, which describes which samples belong to which groups. More details are given in the edgeR manual: http://www.bioconductor.org/packages/2.12/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf and in the limma manual.

Because creating a design matrix can be complex and time-consuming, especially without a GUI, this package includes a companion tool, edgeR Design Matrix Creator, to help with it. Once an appropriate design matrix (with corresponding links to the files) has been provided, the correct contrast ( http://en.wikipedia.org/wiki/Contrast_(statistics) ) must be specified.

For example, to compare two groups with equal weight, the contrast would be either "g1-g2" or "normal-cancer", depending on the group names.

The test function uses an MCF7 dataset from a study indicating that a higher sequencing depth is not necessarily more important than a larger number of replicates [2].

Input

Expression matrix

Geneid  "\t" Sample-1 "\t" Sample-2 "\t" Sample-3 "\t" Sample-4 [...] "\n"
SMURF   "\t"      123 "\t"       21 "\t"    34545 "\t"       98  ...  "\n"
BRCA1   "\t"      435 "\t"     6655 "\t"       45 "\t"       55  ...  "\n"
LINK33  "\t"        4 "\t"      645 "\t"      345 "\t"        1  ...  "\n"
SNORD78 "\t"      498 "\t"       65 "\t"       98 "\t"       27  ...  "\n"
[...]

Note: Make sure the number of columns in the header is identical to the number of columns in the body.
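
The column-count requirement in the note above can be checked before uploading a file. The following is a minimal Python sketch, not part of the tool itself: the function name `check_expression_matrix` and the sample text are hypothetical, and it assumes a tab-separated file whose first row is the header.

```python
import csv
import io


def check_expression_matrix(text):
    """Return (line_number, n_columns) for each body row whose
    column count differs from the header's column count."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t",
                           quoting=csv.QUOTE_NONE))
    if not rows:
        raise ValueError("expression matrix is empty")
    expected = len(rows[0])
    # Line numbers are 1-based; the header is line 1, so the body starts at 2.
    return [(line_no, len(row))
            for line_no, row in enumerate(rows[1:], start=2)
            if len(row) != expected]


# Hypothetical example: the BRCA1 row is missing one value.
example = (
    "Geneid\tSample-1\tSample-2\n"
    "SMURF\t123\t21\n"
    "BRCA1\t435\n"
)
problems = check_expression_matrix(example)
print(problems)
```

An empty list means every row has as many columns as the header; otherwise each entry names the offending line and its column count.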

Design matrix

Sample    "\t" Condition "\t" Ethnicity "\t" Patient "\t" Batch "\n"
Sample-1  "\t"     Tumor "\t"  European "\t"       1 "\t"     1 "\n"
Sample-2  "\t"    Normal "\t"  European "\t"       1 "\t"     1 "\n"
Sample-3  "\t"     Tumor "\t"  European "\t"       2 "\t"     1 "\n"
Sample-4  "\t"    Normal "\t"  European "\t"       2 "\t"     1 "\n"
Sample-5  "\t"     Tumor "\t"   African "\t"       3 "\t"     1 "\n"
Sample-6  "\t"    Normal "\t"   African "\t"       3 "\t"     1 "\n"
Sample-7  "\t"     Tumor "\t"   African "\t"       4 "\t"     2 "\n"
Sample-8  "\t"    Normal "\t"   African "\t"       4 "\t"     2 "\n"
Sample-9  "\t"     Tumor "\t"     Asian "\t"       5 "\t"     2 "\n"
Sample-10 "\t"    Normal "\t"     Asian "\t"       5 "\t"     2 "\n"
Sample-11 "\t"     Tumor "\t"     Asian "\t"       6 "\t"     2 "\n"
Sample-12 "\t"    Normal "\t"     Asian "\t"       6 "\t"     2 "\n"

Note: Avoid factor names that (1) are numerical or (2) contain mathematical symbols; preferably use letters only.
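
As an illustration of the design matrix layout and the naming note above, here is a minimal Python sketch. It is not the tool's own validation: the function names, the `SAFE_NAME` pattern, and the sample text are hypothetical. It assumes a tab-separated file whose first column is named "Sample", and it interprets "factor names" as the column headers.

```python
import csv
import io
import re

# Assumption: a safe name starts with a letter and contains only
# letters, digits, or underscores.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def unsafe_factor_names(text):
    """Return the sorted factor (column) names that do not match SAFE_NAME."""
    header = next(csv.reader(io.StringIO(text), delimiter="\t"))
    # The first column holds sample names; the remaining columns are factors.
    return sorted(name for name in header[1:] if not SAFE_NAME.fullmatch(name))


def samples_by_level(text, factor):
    """Group sample names by the level they have for the given factor."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        groups.setdefault(row[factor], []).append(row["Sample"])
    return groups


# Hypothetical design matrix with one problematic factor name ("Batch-2").
design = (
    "Sample\tCondition\tBatch-2\n"
    "Sample-1\tTumor\t1\n"
    "Sample-2\tNormal\t1\n"
    "Sample-3\tTumor\t2\n"
)
bad_names = unsafe_factor_names(design)
groups = samples_by_level(design, "Condition")
print(bad_names)
print(groups)
```

Here "Batch-2" is flagged because the hyphen could be read as a minus sign when the name appears in a contrast.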

Contrast

The contrast represents the biological question. There can be many questions asked, e.g.:

  • Tumor-Normal
  • African-European
  • 0.5*(Control+Placebo) / Treated
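
To make the idea of a contrast concrete, the sketch below turns a simple contrast string into a weight per group. It is an illustration only and does not reproduce how edgeR or limma parse contrasts: the function `contrast_weights` and the `TERM` pattern are hypothetical, and only sums and differences of group names with an optional numeric multiplier are supported. Parentheses and division, as used in the third example above, are deliberately rejected.

```python
import re

# One term: optional sign, optional numeric multiplier ("0.5*"), group name.
TERM = re.compile(
    r"\s*([+-]?)\s*(?:(\d+(?:\.\d+)?)\s*\*\s*)?([A-Za-z][A-Za-z0-9_]*)\s*"
)


def contrast_weights(contrast):
    """Turn a contrast such as "Tumor-Normal" into a {group: weight} dict."""
    weights = {}
    pos = 0
    while pos < len(contrast):
        match = TERM.match(contrast, pos)
        if match is None:
            raise ValueError(f"unsupported contrast syntax near {contrast[pos:]!r}")
        sign, multiplier, group = match.groups()
        if pos > 0 and not sign:
            # Every term after the first must be joined by "+" or "-".
            raise ValueError(f"missing '+' or '-' before {group!r}")
        weight = float(multiplier) if multiplier else 1.0
        if sign == "-":
            weight = -weight
        weights[group] = weights.get(group, 0.0) + weight
        pos = match.end()
    return weights


simple = contrast_weights("Tumor-Normal")
averaged = contrast_weights("0.5*Control+0.5*Placebo-Treated")
print(simple)
print(averaged)
```

In both calls the weights sum to zero, which is typical of a contrast that compares groups with each other.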

Contact

The tool wrapper was written by Youri Hoogstrate of the Erasmus Medical Center (Rotterdam, Netherlands).

I would like to thank Hina Riaz - Naz Khan for her helpful contribution.