Galaxy | Tool Preview

fgsea (version 1.8.0+galaxy1)
A tabular file with gene symbols in the first column, and a ranked statistic (e.g. t-statistic or log fold-change) in the second column
If this option is set to Yes, the tool will assume that the ranked genes file has a column header in the first row and that the identifiers start on the second line. Default: Yes
A tabular file in GMT format or an RData file containing a list of gene sets. See below for more information.
Minimal size of a gene set to test. All pathways below the threshold are excluded. Default: 1
Maximal size of a gene set to test. All pathways above the threshold are excluded. Default: 500
Number of permutations to perform. The minimal possible nominal p-value is about 1/nperm. Default: 1000
Output a PDF file containing plots for the top pathways, ranked by p-value. Default: No
If Output plots is selected, the number of top pathways to plot can be specified. Default: 10
Output all the data used by R in the fgsea analysis in a file that can be loaded into R. Default: No
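
The minimal and maximal size options can be pictured with a short Python sketch. It is an illustration only, not the wrapper's code: the function name `filter_gene_sets` and the toy data are hypothetical, and it assumes that set size is counted after restricting each set to genes present in the ranked list and that both thresholds are inclusive.

```python
def filter_gene_sets(gene_sets, ranked_ids, min_size=1, max_size=500):
    """Keep gene sets whose overlap with the ranked list has min_size..max_size genes."""
    ranked = set(ranked_ids)
    kept = {}
    for name, genes in gene_sets.items():
        # Assumption: only genes that also appear in the ranked list are counted.
        present = [g for g in genes if g in ranked]
        if min_size <= len(present) <= max_size:
            kept[name] = present
    return kept


# Hypothetical toy data, for illustration only.
gene_sets = {
    "SET_A": ["VDR", "RCAN1", "HILPDA"],
    "SET_B": ["GENE_X"],  # no overlap with the ranked list
}
ranked_ids = ["VDR", "IL20RA", "MPHOSPH10", "RCAN1", "HILPDA"]

kept = filter_gene_sets(gene_sets, ranked_ids, min_size=2, max_size=500)
print(sorted(kept))
```

With these toy inputs only SET_A passes, because SET_B shares no identifiers with the ranked list.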

fgsea is a Bioconductor package for fast preranked gene set enrichment analysis (GSEA). Its speed comes from an algorithm that calculates the GSEA statistic cumulatively, which allows samples to be reused across different gene set sizes. See the preprint for algorithmic details.
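
To make the idea of a preranked enrichment statistic and a permutation p-value concrete, here is a naive Python sketch. It is not the cumulative algorithm used by fgsea: it recomputes a weighted running-sum score for every random gene set, and the p-value rule shown (two-sided, with a +1 correction) is a simplifying assumption. The function names are hypothetical. It does show why the nominal p-value cannot fall much below 1/nperm.

```python
import random


def enrichment_score(stats, hit_indices):
    """Running-sum enrichment score.

    stats: statistics sorted in decreasing order.
    hit_indices: positions in `stats` that belong to the gene set.
    Assumes at least one hit with a non-zero statistic and at least one miss.
    """
    hits = set(hit_indices)
    n_miss = len(stats) - len(hits)
    hit_weight = sum(abs(stats[i]) for i in sorted(hits))
    running, best = 0.0, 0.0
    for i, s in enumerate(stats):
        if i in hits:
            running += abs(s) / hit_weight  # step up, weighted by the statistic
        else:
            running -= 1.0 / n_miss  # step down by a constant amount
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


def permutation_pvalue(stats, hit_indices, nperm=1000, seed=0):
    """Naive p-value: share of random same-size sets scoring at least as extreme."""
    rng = random.Random(seed)
    observed = enrichment_score(stats, hit_indices)
    k = len(set(hit_indices))
    as_extreme = 0
    for _ in range(nperm):
        sample = rng.sample(range(len(stats)), k)
        if abs(enrichment_score(stats, sample)) >= abs(observed):
            as_extreme += 1
    # The +1 terms keep the estimate above zero, so it is never below 1 / (nperm + 1).
    return observed, (as_extreme + 1) / (nperm + 1)


# Hypothetical ranked statistics; the gene set is the two top-ranked genes.
stats = [5, 4, 3, 2, 1, -1, -2, -3, -4, -5]
es, pval = permutation_pvalue(stats, {0, 1}, nperm=100, seed=1)
print(es, pval)
```

Increasing nperm lowers the smallest p-value that can be reported, at the cost of a longer run time in this naive form.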


Inputs

Ranked Genes

A two-column file containing a ranked list of genes is required. The first column must contain the gene identifiers and the second column the statistic used to rank. Gene identifiers must be unique (not repeated) within the file and must be the same type as the identifiers in the Gene Sets file.

Example:

Symbol Ranked Stat
VDR 67.198
IL20RA 65.963
MPHOSPH10 51.353
RCAN1 50.269
HILPDA 50.015
TSC22D3 47.496
FAM107B 45.926
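
The requirements above (two columns, unique identifiers, a numeric statistic) can be checked before uploading a file. The following Python sketch is one possible check, not part of the tool: the function name `read_ranked_genes` is hypothetical, and it assumes the file is tab-separated.

```python
import csv
import io


def read_ranked_genes(handle, header=True):
    """Read a two-column tab-separated file into (identifier, statistic) pairs."""
    rows = list(csv.reader(handle, delimiter="\t"))
    first_line = 1
    if header:
        rows = rows[1:]  # skip the column header row
        first_line = 2
    ranked = []
    seen = set()
    for line_no, row in enumerate(rows, start=first_line):
        if not row:
            continue  # ignore blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, found {len(row)}")
        gene, stat = row[0], float(row[1])  # float() raises ValueError if not numeric
        if gene in seen:
            raise ValueError(f"line {line_no}: duplicate identifier {gene!r}")
        seen.add(gene)
        ranked.append((gene, stat))
    return ranked


# First rows of the example above, written with tab separators.
example = "Symbol\tRanked Stat\nVDR\t67.198\nIL20RA\t65.963\nMPHOSPH10\t51.353\n"
ranked = read_ranked_genes(io.StringIO(example))
print(ranked)
```

A repeated identifier or a row with the wrong number of columns raises a ValueError that names the offending line.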

Gene Sets

A Gene Sets file is required. This can be a tabular file in Gene Matrix Transposed (GMT) format. In GMT format, each row represents a gene set, with the set name in the first column, a description in the second, and the identifiers of the genes in the set in the following columns (see the example below). GMT files with any type of identifier (e.g. Entrez IDs, Symbols) can be used, but the same type of identifier must be present in the Ranked Genes file. More information on the GMT format can be found at the Broad website. GMT files for human gene sets can be obtained from the Broad's MSigDB collections.

Example:
HALLMARK_APOPTOSIS http://www.broadinstitute.org/gsea/msigdb/cards/HALLMARK_APOPTOSIS CASP3 CASP9 ...
HALLMARK_HYPOXIA http://www.broadinstitute.org/gsea/msigdb/cards/HALLMARK_HYPOXIA PGK1 PDK1 ...
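
The GMT layout described above (name, description, then gene identifiers, all tab-separated) is simple to read programmatically. The Python sketch below is a minimal illustration with a hypothetical function name, `parse_gmt`. The sample text contains only the genes shown in the example, so the gene lists are deliberately incomplete.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into a dict mapping set name to a list of gene identifiers."""
    gene_sets = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"line {line_no}: expected name, description and at least one gene")
        name = fields[0]
        # fields[1] is the description column and is not needed here.
        gene_sets[name] = [g for g in fields[2:] if g]
    return gene_sets


# Abbreviated rows: only the genes visible in the example above are included.
gmt_text = (
    "HALLMARK_APOPTOSIS\thttp://www.broadinstitute.org/gsea/msigdb/cards/HALLMARK_APOPTOSIS\tCASP3\tCASP9\n"
    "HALLMARK_HYPOXIA\thttp://www.broadinstitute.org/gsea/msigdb/cards/HALLMARK_HYPOXIA\tPGK1\tPDK1\n"
)
gene_sets = parse_gmt(gmt_text)
print(gene_sets["HALLMARK_APOPTOSIS"])
```

Rows may have different numbers of columns, because each gene set can contain a different number of genes.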

Alternatively, an RData file containing a collection of gene sets can be used as input, such as the ones provided here, which contain mouse versions of the MSigDB collections.


Outputs


Wrapper released under MIT License. Copyright (c) 2017 Mark Dunning