
topGO enrichment analysis (version 0.1.0)

Select your type of input file:

The identifiers must be Ensembl gene IDs (e.g. ENSG00000139618). If they are not, please use the ID Mapping tool first.

Choose an input file:

This file must have one column filled with IDs consistent with the database that will be used. Please use the MappingIDs component if this is not the case.

Specify the column that contains the identifiers to analyze (e.g. c1):

Does your file have a header?:

Select a species:

Ontology category:

Choose the topGO option for your analysis:

Enter the p-value threshold in the form 1e-level (e.g. 1e-3):

Choose a correction for multiple testing:

Generate a text file for results:

Generate a barplot of over-represented GO terms:

Generate a dotplot of over-represented GO terms:

Galaxy component based on the R package topGO.

Input required

This component works with Ensembl gene IDs (e.g. ENSG00000139618). You can copy and paste these identifiers or supply a tabular file (.csv, .tsv, .txt, .tab) that contains them.
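As a quick pre-check before submitting, the identifier format can be validated locally. The sketch below assumes that a human Ensembl gene ID is "ENSG" followed by exactly 11 digits, as in the example above; the function name split_ids is hypothetical and is not part of this component.

```python
import re

# Assumption: human Ensembl gene IDs are "ENSG" followed by exactly 11 digits.
ENSEMBL_GENE_RE = re.compile(r"^ENSG\d{11}$")


def split_ids(ids):
    """Partition identifiers into (valid, invalid) lists after trimming whitespace."""
    valid, invalid = [], []
    for raw in ids:
        candidate = raw.strip()
        if ENSEMBL_GENE_RE.match(candidate):
            valid.append(candidate)
        else:
            invalid.append(candidate)
    return valid, invalid


if __name__ == "__main__":
    ok, bad = split_ids(["ENSG00000139618", "BRCA2", "ENSG123"])
    print("valid:", ok)
    print("need ID mapping:", bad)
```

Identifiers that land in the second list would need to go through ID mapping before the analysis.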

Principle

This component assesses how GO terms are represented in a gene list within one ontology category (Biological Process "BP", Cellular Component "CC", Molecular Function "MF"). Representation is evaluated against the background list of all human genes associated with GO terms of the chosen category (BP, CC, MF). This background is given by the R package "org.Hs.eg.db", a genome-wide annotation package for human.

Output

Three kinds of output are available: a textual output, a barplot output and a dotplot output.

Textual output: the text output lists all the GO terms found significant under the specified threshold.

The fields are as follows:

Annotated: number of genes in org.Hs.eg.db that are annotated with the GO term.
Significant: number of genes in your input that are annotated with the GO term.
Expected: estimated number of genes a node of size Annotated would have if the significant genes were randomly selected from the gene universe.
pvalues: p-value obtained from the test.
(qvalues: optional additional column with adjusted p-values.)
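To make the Expected field concrete, here is a minimal sketch of the arithmetic implied by the description above: under random selection, a GO term receives a share of the significant genes proportional to its size. The function and parameter names are hypothetical, and the formula is an assumption derived from that description rather than a transcription of topGO's code.

```python
def expected_count(annotated, n_significant, universe_size):
    """Expected number of significant genes for a GO term under random selection.

    annotated:      genes in the universe annotated with the GO term
    n_significant:  total significant (input) genes found in the universe
    universe_size:  total genes in the background universe
    """
    if universe_size <= 0:
        raise ValueError("universe_size must be positive")
    return annotated * n_significant / universe_size


if __name__ == "__main__":
    # Illustrative numbers only: a term with 200 annotated genes,
    # 100 input genes, and a background of 20000 genes.
    print(expected_count(200, 100, 20000))
```

A Significant value well above Expected is what the test then evaluates as over-representation.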

Tests

topGO provides a classic Fisher test to evaluate whether some GO terms are over-represented in your gene list, but other options are also provided (elim, weight01, parentchild). For the merits of each option and their algorithmic descriptions, please refer to the topGO manual: https://bioconductor.org/packages/release/bioc/vignettes/topGO/inst/doc/topGO.pdf
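For intuition about the classic option, the sketch below computes a one-sided Fisher (hypergeometric) p-value for a single GO term using only the Python standard library. It illustrates the general test, not topGO's implementation, and it does not cover elim, weight01 or parentchild; all names are hypothetical.

```python
from math import comb


def fisher_over_representation(universe, annotated, selected, overlap):
    """One-sided p-value: P(X >= overlap) under the hypergeometric distribution.

    universe:  total genes in the background
    annotated: background genes annotated with the GO term
    selected:  genes in the input list (within the background)
    overlap:   input genes annotated with the GO term
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    # Sum the probabilities of observing `overlap` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only.
    print(fisher_over_representation(universe=10, annotated=4, selected=3, overlap=2))
```

In terms of the output fields, annotated corresponds to Annotated and overlap to Significant.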

Multiple testing corrections

The following corrections for multiple testing can also be applied:
- holm
- hochberg
- hommel
- bonferroni
- BH
- BY
- fdr
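The method names listed here match those accepted by R's p.adjust function, where fdr is an alias for BH; whether this component calls p.adjust internally is an assumption. As an illustration of what such a correction does, here is a minimal sketch of the Benjamini-Hochberg (BH) adjustment, with a hypothetical function name.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Visit p-values from largest to smallest so that a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative p-values only.
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

The adjusted values play the role of the qvalues column in the textual output.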