Seurat FindVariableGenes (version 4.0.4+galaxy0)

Choose the format of the input:

Seurat RDS, Seurat H5, Single Cell Experiment RDS, Loom or AnnData

RDS file:

Select RDS file(s) with Seurat object for input

Choose the format of the output:

Seurat, Single Cell Experiment, AnnData or Loom

Number of features:

Number of features to return.

Mean function:

Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values.

Dispersion function:

Function to compute y-axis value (dispersion). Default is to take the standard deviation of all values.

X-axis low cutoff:

Bottom cutoff on x-axis (mean) for identifying variable genes.

X-axis high cutoff:

Top cutoff on x-axis (mean) for identifying variable genes.

Y-axis low cutoff:

Bottom cutoff on y-axis (dispersion) for identifying variable genes.

Y-axis high cutoff:

Top cutoff on y-axis (dispersion) for identifying variable genes.

Selection method:

How to choose top variable features. Choose one of: 'vst', 'mvp', 'disp'.

What it does

This tool identifies genes that are outliers on a 'mean variability plot'. First, it uses a function to calculate the average expression (mean.function) and dispersion (dispersion.function) of each gene. Next, it divides genes into num.bin (default 20) bins based on their average expression and calculates z-scores for dispersion within each bin. This identifies variable genes while controlling for the strong relationship between variability and average expression.
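
The binning and z-scoring step described above can be sketched in plain Python (Seurat itself is written in R). This is a minimal illustration, not Seurat's implementation: the function name, the equal-width binning along the mean axis, and the choice to assign a z-score of 0 when a bin has a single gene or no spread are all assumptions made for the example.

```python
import statistics

def binned_dispersion_zscores(means, dispersions, num_bin=20):
    """Z-score each gene's dispersion against genes with a similar mean.

    means, dispersions: per-gene values, in the same order.
    num_bin: number of equal-width bins along the mean axis (assumed scheme).
    """
    lo, hi = min(means), max(means)
    # Fall back to a width of 1.0 when every gene has the same mean.
    width = (hi - lo) / num_bin or 1.0
    # Assign each gene to a bin; the gene with the largest mean goes in the last bin.
    bins = [min(int((m - lo) / width), num_bin - 1) for m in means]

    members = {}
    for idx, b in enumerate(bins):
        members.setdefault(b, []).append(idx)

    z = [0.0] * len(means)
    for idxs in members.values():
        vals = [dispersions[i] for i in idxs]
        mu = statistics.fmean(vals)
        # Sample standard deviation; undefined for one gene, so treat as 0.
        sd = statistics.stdev(vals) if len(vals) > 1 else 0.0
        for i in idxs:
            # Assumption: genes in a bin with no spread get a z-score of 0.
            z[i] = (dispersions[i] - mu) / sd if sd > 0 else 0.0
    return z
```

Because the z-score is computed within each bin, a gene is only compared with genes of similar average expression, which is how the mean-variability relationship is controlled for.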

For the mean.var.plot method: exact parameter settings vary from dataset to dataset and are usually chosen empirically, by visual inspection of the plot. Setting the y.cutoff parameter to 2 identifies features whose dispersion is more than two standard deviations away from the average dispersion within a bin. The default X-axis function is the mean expression level, and the default Y-axis function is log(variance/mean). Means and variances are calculated in non-log space, but the results are reported in log-space; see the relevant functions for exact details.
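
As a hedged sketch of the two ideas in this paragraph, the Python below computes a gene's mean and log(variance/mean) in non-log space and reports them in log-space, then applies the four axis cutoffs. It assumes the input values are log1p-normalised expression values; the function names, the default cutoff values, and the handling of genes with zero mean or variance are illustrative assumptions, not the tool's documented behaviour.

```python
import math
import statistics

def mean_and_dispersion(log_values):
    """Return (log1p(mean), log(variance / mean)) for one gene.

    Assumes log_values are log1p-normalised; they are back-transformed so
    that the mean and variance are computed in non-log space.
    """
    raw = [math.expm1(v) for v in log_values]
    mean = statistics.fmean(raw)
    var = statistics.variance(raw) if len(raw) > 1 else 0.0
    if mean <= 0 or var <= 0:
        # Assumption: report a dispersion of 0 when the ratio is undefined.
        return math.log1p(max(mean, 0.0)), 0.0
    return math.log1p(mean), math.log(var / mean)

def select_variable(genes, means, scaled_dispersions,
                    x_low=0.1, x_high=8.0, y_low=1.0, y_high=math.inf):
    """Keep genes whose mean and scaled dispersion fall inside the cutoffs.

    The default cutoff values here are placeholders chosen for the example.
    """
    return [
        g for g, m, d in zip(genes, means, scaled_dispersions)
        if x_low < m < x_high and y_low < d < y_high
    ]

# Usage: with a y-axis low cutoff of 2, only genes whose scaled dispersion
# is more than two standard deviations above their bin average are kept.
picked = select_variable(
    ["geneA", "geneB", "geneC"],   # hypothetical gene names
    [0.05, 1.0, 1.0],              # mean expression (x-axis)
    [3.0, 2.5, 1.0],               # scaled dispersion (y-axis)
    y_low=2.0,
)
```

In this usage example geneA is excluded by the x-axis low cutoff and geneC by the y-axis low cutoff, so only geneB is selected.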

Seurat is a toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. It is developed and maintained by the Satija Lab at NYGC. Seurat aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data.

Inputs

Seurat RDS object

Mean function. Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values.

Dispersion function. Function to compute y-axis value (dispersion). Default is to take the standard deviation of all values.

Bottom cutoff on x-axis for identifying variable genes.

Top cutoff on x-axis for identifying variable genes.

Bottom cutoff on y-axis for identifying variable genes.

Top cutoff on y-axis for identifying variable genes.

Outputs

Seurat RDS object. Places variable genes in object@var.genes. The results of the analysis are stored in object@hvg.info.

Tabular file of variable genes

Version history

4.0.0: Moves to Seurat 4.0.0, introducing a number of methods for merging datasets, plus the whole suite of Seurat plots. Pablo Moreno with funding from AstraZeneca.

3.2.3+galaxy0: Moves to Seurat 3.2.3 and introduces a convert method, improving format interconversion support.

3.1.2_0.0.8: Update metadata parsing

3.1.1_0.0.7: Exposes perplexity and enables tab input.

3.1.1_0.0.6+galaxy0: Moved to Seurat 3.

Find clusters: removed dims-use, k-param, prune-snn.

2.3.1+galaxy0: Improved documentation and further exposure of all of the script's options. Pablo Moreno, Jonathan Manning and Ni Huang, Expression Atlas team https://www.ebi.ac.uk/gxa/home at EMBL-EBI https://www.ebi.ac.uk/. Parts obtained from wrappers by Christophe Antoniewski (GitHub drosofff) and Lea Bellenger (GitHub bellenger-l).

0.0.1: Initial contribution. Maria Doyle (GitHub mblue9).