Human Cell Atlas Matrix Downloader (version v0.0.4+galaxy0)
HCA project identifier: the project title, project label, or project UUID.
Output matrix format: Matrix Market or Loom.
Some studies contain more than one species; in that case, one species must be given as input. If you try to download data from such a study without specifying a species, the tool will fail and list the available species in the job's stdout, so that you can paste one of them here. This field is not needed for studies with a single species, which is the case for most studies.

Download expression matrices from HCA projects using the HCA DCP's matrix service API

This tool retrieves expression matrices and metadata for any public experiment available at the Human Cell Atlas data portal.

To use it, set the title, label, or UUID of the desired project, which can be found in the HCA data browser (https://data.humancellatlas.org/explore/projects), and select the desired matrix format (Matrix Market or Loom).

For projects that contain more than one organism, one organism must be specified. If none is specified, the job will fail and the available organisms will be listed in the job's stdout.

Outputs will be:

  • When "Matrix Market" is selected, outputs are in 10X-compatible Matrix Market format:

    1. Matrix (txt):

      Contains the expression values for genes (rows) and cells (columns) as raw counts. This text file is formatted as a Matrix Market file, and as such it is accompanied by separate files for the gene identifiers and the cell identifiers.

    2. Genes (tsv):

      Identifiers for the genes present in the expression matrix (the identifier column is repeated), in the same order as the matrix rows.

    3. Barcodes (tsv):

      Identifiers for the cells of the data matrix. The file is ordered to match the columns of the matrix.

    4. Experiment Design file (tsv):

      Contains metadata for the different cells of the experiment.

  • When "Loom" is selected, output is a single Loom HDF5 file:

    1. Loom (h5):

      Contains the raw-count expression values for genes (rows) and cells (columns), together with the cell metadata table and the gene metadata table, in a single HDF5 file that follows the Loom specification at http://linnarssonlab.org/loompy/format/index.html.
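
For the Matrix Market outputs described above, the genes file is expected to line up with the matrix rows and the barcodes file with the matrix columns. The following is a minimal sketch of how that alignment could be checked after download, using only the Python standard library. The function names and the file names in the usage note are hypothetical, and the sketch assumes the matrix is in Matrix Market coordinate (sparse) format, whose size line holds the number of rows, columns and stored entries. It is not part of the tool itself.

```python
import csv


def read_mtx_shape(mtx_path):
    """Return (n_rows, n_cols, n_entries) from a Matrix Market coordinate file."""
    with open(mtx_path, encoding="utf-8") as handle:
        for line in handle:
            # Lines starting with '%' are the banner or comments; skip them.
            if line.startswith("%") or not line.strip():
                continue
            # Assumption: coordinate format, so the size line has three integers.
            n_rows, n_cols, n_entries = (int(x) for x in line.split())
            return n_rows, n_cols, n_entries
    raise ValueError(f"no size line found in {mtx_path}")


def read_first_column(tsv_path):
    """Return the first column of a tab-separated file as a list of strings."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        return [row[0] for row in csv.reader(handle, delimiter="\t") if row]


def check_alignment(mtx_path, genes_path, barcodes_path):
    """Check that genes match matrix rows and barcodes match matrix columns."""
    n_rows, n_cols, _ = read_mtx_shape(mtx_path)
    genes = read_first_column(genes_path)
    barcodes = read_first_column(barcodes_path)
    if len(genes) != n_rows:
        raise ValueError(f"{len(genes)} genes but {n_rows} matrix rows")
    if len(barcodes) != n_cols:
        raise ValueError(f"{len(barcodes)} barcodes but {n_cols} matrix columns")
    return genes, barcodes
```

With the three downloaded datasets saved locally under hypothetical names, a call such as `check_alignment("matrix.mtx", "genes.tsv", "barcodes.tsv")` returns the gene and cell identifiers, or raises `ValueError` if the counts disagree with the matrix dimensions. For loading the counts themselves, a sparse-matrix reader such as `scipy.io.mmread` is the more usual choice.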

Version history

0.0.4+galaxy0: Retrieves data from the EBI FTP until an equivalent Matrix service for DCP 2.0 is established. Handles multi-organism studies.

0.0.2+galaxy0: Initial contribution. Ni Huang and Pablo Moreno, Teichmann Lab at Wellcome Sanger Institute and Expression Atlas team https://www.ebi.ac.uk/gxa/home at EMBL-EBI https://www.ebi.ac.uk/.