Galaxy | Tool Preview

Seurat FindVariableGenes (version 4.0.4+galaxy0)
Seurat RDS, Seurat H5, Single Cell Experiment RDS, Loom or AnnData
Select RDS file(s) with Seurat object for input
Seurat, Single Cell Experiment, AnnData or Loom
Number of features to return.
Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values.
Function to compute y-axis value (dispersion). Default is to take the standard deviation of all values.
Bottom cutoff on x-axis (mean) for identifying variable genes.
Top cutoff on x-axis (mean) for identifying variable genes.
Bottom cutoff on y-axis (dispersion) for identifying variable genes.
Top cutoff on y-axis (dispersion) for identifying variable genes.
How to choose top variable features. Choose one of: 'vst', 'mvp', or 'disp'.

What it does

This tool identifies genes that are outliers on a 'mean variability plot'. First, it uses a function to calculate the average expression (mean.function) and dispersion (dispersion.function) of each gene. Next, it divides genes into num.bin (default 20) bins based on their average expression and calculates z-scores for dispersion within each bin. This identifies variable genes while controlling for the strong relationship between variability and average expression.
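The binning and z-scoring steps described above can be sketched in plain Python. This is an illustrative approximation, not Seurat's implementation: the function names (bin_index, dispersion_zscores), the equal-width binning, the choice of the plain mean and log(variance/mean) as the two functions, and the example values are all assumptions made for this sketch. It also assumes every gene has a positive mean and a positive variance.

```python
import math
from statistics import mean, stdev, variance


def bin_index(value, lo, hi, num_bin):
    """Return the equal-width bin (0 .. num_bin - 1) that `value` falls into."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * num_bin)
    return min(idx, num_bin - 1)  # the maximum value belongs to the last bin


def dispersion_zscores(expr, num_bin=20):
    """expr maps gene -> list of non-log expression values across cells.

    Returns gene -> (average expression, dispersion, z-score within bin).
    Assumes each gene has a positive mean and a positive variance.
    """
    avg = {g: mean(v) for g, v in expr.items()}
    # Dispersion taken here as log(variance / mean); an assumption of this sketch.
    disp = {g: math.log(variance(v) / avg[g]) for g, v in expr.items()}

    # Group genes into bins by average expression.
    lo, hi = min(avg.values()), max(avg.values())
    bins = {}
    for g in expr:
        bins.setdefault(bin_index(avg[g], lo, hi, num_bin), []).append(g)

    # Z-score the dispersion of each gene against the other genes in its bin.
    result = {}
    for members in bins.values():
        values = [disp[g] for g in members]
        centre = mean(values)
        spread = stdev(values) if len(values) > 1 else 0.0
        for g in members:
            z = (disp[g] - centre) / spread if spread > 0 else 0.0
            result[g] = (avg[g], disp[g], z)
    return result


# Made-up expression values, for illustration only.
example_expr = {
    "g_a": [1, 2, 3],
    "g_b": [1, 2, 3],
    "g_c": [0, 0, 6],    # same mean as g_a and g_b, much larger variance
    "g_d": [9, 10, 11],
    "g_e": [8, 10, 12],
}
scores = dispersion_zscores(example_expr, num_bin=2)
```

In this toy input g_c has the same average expression as g_a and g_b but a much larger variance, so it receives the highest z-score in its bin. Comparing genes only within a bin is what keeps highly expressed genes from dominating the ranking.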

For the mean.var.plot method: exact parameter settings may vary from dataset to dataset and are best chosen empirically, based on visual inspection of the plot. Setting the y.cutoff parameter to 2 identifies features that are more than two standard deviations away from the average dispersion within a bin. The default X-axis function is the mean expression level, and the default Y-axis function is log(Variance/mean). Mean and variance calculations are not performed in log-space, but the results are reported in log-space; see the relevant functions for exact details.
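As a rough illustration of how the four cutoffs interact, the sketch below filters genes by average expression (x-axis) and scaled dispersion (y-axis). The function name select_variable_genes, the use of strict inequalities, and the example values are assumptions made for this sketch, not the tool's actual code.

```python
import math


def select_variable_genes(stats, x_low, x_high, y_low, y_high):
    """stats maps gene -> (average expression, scaled dispersion).

    Keeps genes whose average expression lies strictly between the x cutoffs
    and whose scaled dispersion lies strictly between the y cutoffs.
    """
    return sorted(
        gene
        for gene, (avg, scaled_disp) in stats.items()
        if x_low < avg < x_high and y_low < scaled_disp < y_high
    )


# Made-up values, for illustration only.
example_stats = {
    "geneA": (0.05, 3.0),  # below the bottom x cutoff
    "geneB": (1.2, 2.5),   # passes both cutoffs
    "geneC": (2.0, 0.4),   # below the bottom y cutoff
    "geneD": (9.5, 4.0),   # above the top x cutoff
    "geneE": (3.1, 2.1),   # passes both cutoffs
}

# A bottom y cutoff of 2 keeps genes more than two standard deviations
# above the average dispersion of their bin.
selected = select_variable_genes(
    example_stats, x_low=0.1, x_high=8.0, y_low=2.0, y_high=math.inf
)
```

With these inputs only geneB and geneE are selected. Raising the bottom y cutoff makes the selection stricter, while the x cutoffs exclude genes that are expressed at very low or very high levels.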

Seurat is a toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. It is developed and maintained by the Satija Lab at NYGC. Seurat aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data.


Inputs

  • Seurat RDS object
  • Mean function. Function to compute x-axis value (average expression). Default is to take the mean of the detected (i.e. non-zero) values.
  • Dispersion function. Function to compute y-axis value (dispersion). Default is to take the standard deviation of all values.
  • Bottom cutoff on x-axis for identifying variable genes.
  • Top cutoff on x-axis for identifying variable genes.
  • Bottom cutoff on y-axis for identifying variable genes.
  • Top cutoff on y-axis for identifying variable genes.

Outputs

Version history

4.0.0: Moves to Seurat 4.0.0, introducing a number of methods for merging datasets, plus the whole suite of Seurat plots. Pablo Moreno with funding from AstraZeneca.

3.2.3+galaxy0: Moves to Seurat 3.2.3 and introduces a convert method, improving format interconversion support.

3.1.2_0.0.8: Update metadata parsing

3.1.1_0.0.7: Exposes perplexity and enables tab input.

3.1.1_0.0.6+galaxy0: Moved to Seurat 3.

Find clusters: removed dims-use, k-param, prune-snn.

2.3.1+galaxy0: Improved documentation and further exposed all of the script's options. Pablo Moreno, Jonathan Manning and Ni Huang, Expression Atlas team https://www.ebi.ac.uk/gxa/home at EMBL-EBI https://www.ebi.ac.uk/. Parts obtained from wrappers by Christophe Antoniewski (GitHub drosofff) and Lea Bellenger (GitHub bellenger-l).

0.0.1: Initial contribution. Maria Doyle (GitHub mblue9).