EBI SCXA Data Retrieval (version v0.0.2+galaxy2)
EBI Single Cell Atlas accession for the experiment that you want to retrieve.
Raw filtered counts or (non-filtered) TPMs

Gene expression analysis in single cells across species and biological conditions

Single Cell Expression Atlas supports research in single cell transcriptomics. The Atlas annotates publicly available single cell RNA-Seq experiments with ontology identifiers and re-analyses them using standardised pipelines available through iRAP, our RNA-Seq analysis toolkit. The browser enables visualisation of clusters of cells and their annotations, and supports searches for gene expression within and across studies.

For more information check https://www.ebi.ac.uk/gxa/sc/home

EBI SCXA Data Retrieval

This tool retrieves expression matrices and metadata for any public experiment available in the EBI Single Cell Expression Atlas.

To use it, simply set the accession for the desired experiment and choose the type of matrix that you want to download:

Raw filtered counts:
 This should be the default choice for clustering and other analysis methods in which you will scale and normalise the data yourself. The filtering is based on the quality control applied by iRAP prior to pseudo-alignment and quantification.
TPMs:
 TPM stands for Transcripts Per Million. As the name implies, these values have already been normalised/scaled, so keep this in mind when passing them to methods that normalise data as part of their procedure. Due to technical particularities in the current Atlas SC pipeline, the TPMs available here are not filtered. Note: droplet-based datasets do not have TPM data.
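
To illustrate why TPM values count as already normalised, here is a minimal sketch of the general TPM calculation. It is not necessarily the exact procedure used by the Atlas pipeline, and the function name `tpm` and the toy counts and lengths are assumptions made up for this example.

```python
def tpm(counts, lengths_kb):
    """Convert raw counts to TPM for a single cell or sample.

    counts      -- raw counts per transcript (toy values here)
    lengths_kb  -- transcript lengths in kilobases, same order as counts
    """
    # Step 1: divide each count by the transcript length (reads per kilobase).
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        # No expression at all: return zeros rather than dividing by zero.
        return [0.0] * len(rates)
    # Step 2: rescale so that the values of one cell sum to one million.
    return [rate / total * 1_000_000 for rate in rates]


# Toy example: three transcripts in one hypothetical cell.
example = tpm([10, 20, 30], [1.0, 2.0, 3.0])
```

Because every cell is rescaled to the same total, applying a further library-size normalisation to TPMs is usually redundant.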

Outputs will be:

Matrix (txt):
 Contains the expression values for genes (rows) and samples/runs/cells (columns), as either raw filtered counts or non-filtered TPMs, depending on the choice made. This text file is formatted as a Matrix Market file, and as such it is accompanied by separate files for the gene identifiers and the samples/runs/cells identifiers.
Genes (tsv):
 Identifiers for the genes present in the expression matrix, in the same order as the matrix rows. The identifier column is repeated.
Barcodes (tsv):
 Identifiers for the cells, samples or runs of the data matrix. The file is ordered to match the columns of the matrix.
Experiment Design file (tsv):
 Contains metadata for the different cells/samples/runs of the experiment. Please note that this file is generated before the filtering step, so it may occasionally contain more cells/samples/runs than the matrix.
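
The outputs described above can be combined as in the following minimal sketch, which uses only the Python standard library. It parses a Matrix Market coordinate file, reads the gene and barcode identifiers, and drops experiment design rows for cells that are absent from the matrix. The function names, the toy data, and the assumption that the first column of the experiment design holds the cell/sample/run identifier are all hypothetical; check them against your downloaded files.

```python
import csv
import io


def read_identifiers(handle):
    """Return the first column of a tab-separated identifier file."""
    return [row[0] for row in csv.reader(handle, delimiter="\t") if row]


def read_matrix_market(handle):
    """Parse a Matrix Market coordinate file.

    Returns (n_rows, n_cols, entries), where entries maps 0-based
    (row, column) pairs to float values.
    """
    shape = None
    entries = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip the banner and comment lines
        fields = line.split()
        if shape is None:
            # First data line: number of rows, columns and stored entries.
            shape = (int(fields[0]), int(fields[1]))
            continue
        # Matrix Market indices are 1-based; convert them to 0-based.
        entries[(int(fields[0]) - 1, int(fields[1]) - 1)] = float(fields[2])
    if shape is None:
        raise ValueError("no size line found in Matrix Market input")
    return shape[0], shape[1], entries


def filter_design(handle, barcodes, id_column=0):
    """Keep only design rows whose identifier is among the barcodes.

    Assumption: the identifier sits in column `id_column` of the design file.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    wanted = set(barcodes)
    rows = [row for row in reader if row and row[id_column] in wanted]
    return header, rows


# Toy inputs standing in for the four downloaded files (made-up content).
matrix_text = (
    "%%MatrixMarket matrix coordinate real general\n"
    "3 2 3\n"
    "1 1 5\n"
    "2 2 7\n"
    "3 1 1\n"
)
genes_text = "gene_1\tgene_1\ngene_2\tgene_2\ngene_3\tgene_3\n"
barcodes_text = "cell_A\ncell_B\n"
design_text = "Assay\tcell_type\ncell_A\tT cell\ncell_B\tB cell\ncell_C\tunknown\n"

n_genes, n_cells, entries = read_matrix_market(io.StringIO(matrix_text))
genes = read_identifiers(io.StringIO(genes_text))
barcodes = read_identifiers(io.StringIO(barcodes_text))
header, design = filter_design(io.StringIO(design_text), barcodes)
```

With real downloads, replace each `io.StringIO(...)` with an opened file handle. For large experiments, a sparse-matrix reader from a numerical library is likely a better choice than this dictionary-based parser.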