
AnnData Operations (version 1.9.3+galaxy1)
Legacy 'h5' datatypes are useful for interacting with older tools. Choosing 'No packaged matrix output' is useful when auxiliary files are generated (as with marker genes). Do not use it when there are no other output files, because the tool will then produce no output in Galaxy.
A tabular file with headers, where the first column contains cell barcodes. It is merged via a left join, so not every cell in obs needs to appear in the metadata. Currently, duplicated column headers are ignored and the original columns in the AnnData object are kept.
If activated, the tool runs 'adata.raw = adata'.
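The left-join behaviour described above can be illustrated with a minimal pandas sketch. This is not the tool's own code: the function name `merge_cell_metadata`, the barcodes, and the column names are hypothetical, and the sketch assumes that obs is indexed by cell barcode.

```python
import pandas as pd

# Hypothetical obs table of an AnnData object, indexed by cell barcode.
obs = pd.DataFrame(
    {"louvain": ["0", "1", "0"]},
    index=pd.Index(["AAAC", "AAAG", "AAAT"], name="barcode"),
)

# Hypothetical metadata table: the first column holds barcodes, one cell
# (AAAT) is missing, and "louvain" duplicates a header already in obs.
meta = pd.DataFrame(
    {
        "barcode": ["AAAC", "AAAG"],
        "louvain": ["9", "9"],
        "batch": ["b1", "b2"],
    }
)


def merge_cell_metadata(obs: pd.DataFrame, meta: pd.DataFrame) -> pd.DataFrame:
    """Left-join metadata onto obs, keeping obs columns on header clashes."""
    # Use the first metadata column (the barcodes) as the join key.
    meta = meta.set_index(meta.columns[0])
    # Ignore metadata columns whose headers already exist in obs.
    new_cols = [c for c in meta.columns if c not in obs.columns]
    # A left join keeps every cell in obs; cells without metadata get NaN.
    return obs.join(meta[new_cols], how="left")


merged = merge_cell_metadata(obs, meta)
```

In this sketch the cell AAAT stays in the result with a missing `batch` value, and the original `louvain` column is left untouched.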
Change field names in AnnData observations
Change field names in AnnData vars
Field inside var.params where the gene symbols are, normally 'index' or 'gene_symbols'
Flag genes that start with these names
Number of top genes in which to calculate the percentage of the flagged genes. Used by sc.pp.calculate_qc_metrics (integer).
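As a rough illustration of gene flagging and the related percentages, the following NumPy sketch computes them on a toy matrix. It is not the tool's implementation (the tool relies on sc.pp.calculate_qc_metrics); the helper names, gene symbols, and counts are hypothetical. The top-genes function follows the pct_counts_in_top_<N>_genes reading, that is, the share of a cell's counts held by its N most expressed genes, and the sketch assumes every cell has at least one count.

```python
import numpy as np

# Toy count matrix: 2 cells x 4 genes (hypothetical values).
counts = np.array([[10, 0, 5, 5],
                   [0, 4, 4, 2]])
gene_symbols = ["MT-CO1", "ACTB", "MT-ND1", "GAPDH"]


def flag_genes(symbols, prefixes):
    """Boolean mask of genes whose symbol starts with any given prefix."""
    return np.array([s.startswith(tuple(prefixes)) for s in symbols])


def pct_flagged(counts, mask):
    """Percentage of each cell's total counts that falls in flagged genes."""
    totals = counts.sum(axis=1)
    return 100.0 * counts[:, mask].sum(axis=1) / totals


def pct_in_top_genes(counts, n_top):
    """Percentage of each cell's counts held by its n_top most expressed genes."""
    # Sort each row in descending order and keep the first n_top values.
    top = np.sort(counts, axis=1)[:, ::-1][:, :n_top]
    return 100.0 * top.sum(axis=1) / counts.sum(axis=1)


mito = flag_genes(gene_symbols, ["MT-"])
pct_mito = pct_flagged(counts, mito)
pct_top2 = pct_in_top_genes(counts, 2)
```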
Field inside var or obs to be made unique by appending a suffix (useful for gene symbols in var). A new field is added with the '_u' suffix. This happens after all the operations above.
This might be relevant for interfacing with newer versions of AnnData, which might complain if .raw includes a null varm object.
Split the AnnData object into multiple AnnData objects based on the values of a given obs key. This is useful, for example, for splitting a dataset by cluster annotation.
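One way to picture making a field unique is the plain-Python sketch below. The separator and numbering scheme (`-1`, `-2`, ...) are assumptions for illustration and may differ from what the tool appends; only the '_u' suffix on the new field name comes from the description above. The function name `make_unique` and the dictionary standing in for var are hypothetical.

```python
def make_unique(values, sep="-"):
    """Return a copy of values in which repeated entries get a numeric suffix.

    The first occurrence is kept as is; later ones become name-1, name-2, ...
    The separator and numbering are assumptions made for this sketch.
    """
    counts = {}
    used = set(values)
    result = []
    for value in values:
        n = counts.get(value, 0)
        if n == 0:
            # First time this value is seen: keep it unchanged.
            result.append(value)
            counts[value] = 1
            continue
        candidate = f"{value}{sep}{n}"
        # Skip suffixed names that already exist among the values.
        while candidate in used:
            n += 1
            candidate = f"{value}{sep}{n}"
        used.add(candidate)
        counts[value] = n + 1
        result.append(candidate)
    return result


# A plain dict stands in for the var table here.
var = {"gene_symbols": ["TP53", "ACTB", "TP53", "TP53"]}
var["gene_symbols_u"] = make_unique(var["gene_symbols"])
```

The original field is left unchanged and the unique values go into a new field, mirroring the '_u' convention.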
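The grouping step behind such a split can be sketched in plain Python as follows. The function name `split_indices_by_key` and the labels are hypothetical; with a real AnnData object, each group of row positions would correspond to a subset such as `adata[adata.obs[key] == value]`.

```python
from collections import defaultdict


def split_indices_by_key(labels):
    """Map each distinct label to the row positions of the cells carrying it."""
    groups = defaultdict(list)
    for position, label in enumerate(labels):
        groups[label].append(position)
    return dict(groups)


# Hypothetical cluster annotation stored under one obs key.
clusters = ["0", "1", "0", "2"]
parts = split_indices_by_key(clusters)
```

Each entry in `parts` identifies the cells that would go into one of the resulting objects.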

Operations on AnnData objects

Performs the following operations:

  • Change observation/var fields, mostly for the convenience of downstream processes. Multiple fields can be changed at once.
  • Flag genes whose names start with a given text: useful for flagging mitochondrial, spike-in or other groups of genes.
  • For the flags created, calculates QC metrics (pct_<flag>_counts).
  • Calculates n_genes, n_counts for cells and n_cells, n_counts for genes.
  • For the top <N> genes specified, calculates QC metrics (pct_counts_in_top_<N>_genes).
  • Make a specified column of var or obs unique (normally useful for gene symbols).
  • Copy from a set of compatible AnnData objects (same cells and genes):
      * Observations, such as clustering results.
      * Embeddings, such as tSNE or UMAPs.
      * Unstructured annotations, like gene markers.
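The per-cell and per-gene counts named in the list above can be pictured with a short NumPy sketch on a toy matrix. The values and variable names are hypothetical, and the sketch assumes a dense cells x genes count matrix; it is not the tool's own code.

```python
import numpy as np

# Toy count matrix: 2 cells (rows) x 3 genes (columns), hypothetical values.
counts = np.array([[3, 0, 1],
                   [0, 0, 2]])

# Per cell: total counts and number of genes with at least one count.
n_counts_per_cell = counts.sum(axis=1)
n_genes_per_cell = (counts > 0).sum(axis=1)

# Per gene: total counts and number of cells with at least one count.
n_counts_per_gene = counts.sum(axis=0)
n_cells_per_gene = (counts > 0).sum(axis=0)
```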

This functionality will probably be added in the future to a larger package.

History

1.9.5+galaxy1: Makes cell metadata optional, to support optional workflow steps.

1.8.1+galaxy10: Adds field to be made unique in obs or var.

1.6.0+galaxy0: Moves to Scanpy Scripts 0.3.0 (Scanpy 1.6.0); versioning switched to track Scanpy, as in other tools.

0.0.3+galaxy0: Adds ability to merge AnnData objects (Scanpy 1.4.3).