Galaxy | Tool Preview

Alevin (version 1.10.1+galaxy2)
Built-ins were indexed using default options
Salmon indices
CB+UMI raw sequence file(s)
Read-sequence file(s)
--libtype
When a single-cell protocol supports variable-length cell barcodes, alevin pads the shorter barcodes with nucleotides so that all barcodes have the same length. The padding scheme guarantees that padding introduces no collisions between barcodes.
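To illustrate why a padding scheme can be collision-free, here is a minimal sketch. It is not alevin's actual implementation: the scheme (an 'A' sentinel followed by 'C' fill) and the function names pad_barcode and unpad_barcode are assumptions chosen for illustration.

```python
def pad_barcode(barcode: str, max_len: int) -> str:
    """Pad a barcode to max_len + 1 bases (hypothetical scheme).

    An 'A' sentinel is appended first, then 'C' fills the remaining positions.
    Every barcode, including full-length ones, receives the sentinel.
    """
    if not 0 < len(barcode) <= max_len:
        raise ValueError(f"barcode length must be between 1 and {max_len}")
    return barcode + "A" + "C" * (max_len - len(barcode))


def unpad_barcode(padded: str) -> str:
    """Invert pad_barcode: drop the trailing 'C' fill, then the 'A' sentinel."""
    stripped = padded.rstrip("C")
    if not stripped.endswith("A"):
        raise ValueError("not a barcode produced by pad_barcode")
    return stripped[:-1]


# Naively appending 'A' to the 5-base barcode would make it equal to the
# 6-base barcode "ACGTCA"; with the sentinel scheme the two stay distinct.
short = pad_barcode("ACGTC", max_len=6)
full = pad_barcode("ACGTCA", max_len=6)
```

Because the number of trailing 'C' bases after the sentinel encodes the original length, every padded barcode maps back to exactly one input, so two different barcodes can never pad to the same string.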
TSV file with no header and two columns, mapping each transcript in the reference to its gene: the first column is the transcript, the second is the corresponding gene.
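As a sketch of the expected layout, the snippet below reads such a two-column, header-less TSV into a dictionary. The function name read_tx2gene and the transcript and gene identifiers are made up for illustration.

```python
import csv
import io


def read_tx2gene(handle) -> dict:
    """Read a header-less, two-column TSV into {transcript: gene}."""
    mapping = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        transcript, gene = row
        mapping[transcript] = gene
    return mapping


# Illustrative content only: two transcripts of geneA and one of geneB.
example = io.StringIO("txA.1\tgeneA\ntxA.2\tgeneA\ntxB.1\tgeneB\n")
tx2gene = read_tx2gene(example)
```

Checking that every row has exactly two columns before running the tool can catch a stray header line or a space-separated file early.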
Advanced options

Salmon is a lightweight method for quantifying transcript abundance from RNA-seq reads, combining a dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure.

The salmon package contains 4 tools:

  • Index: creates a salmon index
  • Quant: quantifies a sample (Reads or mapping-based)
  • Alevin: Single-cell analysis
  • Quantmerge: Merges multiple quantifications into a single file

Galaxy divides these four into three separate tools in the IUC toolshed:

  • Salmon quant
  • Salmon quantmerge
  • Alevin

Alevin is a tool, integrated with the salmon software, that introduces a family of algorithms for quantification and analysis of 3’ tagged-end single-cell sequencing data. Currently alevin supports the following two major droplet-based single-cell protocols:

  • Drop-seq
  • 10x-Chromium v1/2/3

Alevin uses the same indexing scheme as salmon for the reference, and consumes a set of FASTA/Q file(s) in which one read file contains the Cellular Barcode (CB) + Unique Molecular Identifier (UMI) and the other contains the read sequence. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools).
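The CB + UMI read layout described above can be sketched as follows. The barcode and UMI lengths in the table are the commonly documented geometries for these protocols and are stated here as assumptions; they are not taken from this page. The names PROTOCOL_GEOMETRY, Tag and split_cb_umi are hypothetical.

```python
from typing import NamedTuple

# Assumed read geometry: (cell barcode length, UMI length), barcode first.
PROTOCOL_GEOMETRY = {
    "dropseq": (12, 8),
    "chromium_v2": (16, 10),
    "chromium_v3": (16, 12),
}


class Tag(NamedTuple):
    barcode: str
    umi: str


def split_cb_umi(read: str, protocol: str) -> Tag:
    """Split a CB+UMI read into its cell barcode and UMI."""
    cb_len, umi_len = PROTOCOL_GEOMETRY[protocol]
    if len(read) < cb_len + umi_len:
        raise ValueError("read is shorter than barcode + UMI")
    # Any bases after the UMI are ignored in this sketch.
    return Tag(read[:cb_len], read[cb_len:cb_len + umi_len])


tag = split_cb_umi("A" * 16 + "C" * 10 + "TT", "chromium_v2")
```

The matching read in the second file carries the cDNA sequence that is mapped to the transcriptome.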

Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all observed CBs, and creates a lightweight data structure for fast lookup and correction of the CBs. In the second phase, alevin uses the read sequences in the other file to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and perform hybrid de-duplication while accounting for UMI collisions. Finally, a CB whitelisting procedure is run after abundance estimation, and a cell-by-gene count matrix is generated.
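The first phase can be pictured with the simplified sketch below: count the observed barcodes, keep the most frequent ones, and map each remaining barcode to a kept barcode when exactly one lies within a single substitution. This is an illustration only, not alevin's algorithm; the fixed n_cells cutoff and the function names are assumptions.

```python
from collections import Counter


def hamming_neighbors(barcode: str, alphabet: str = "ACGT"):
    """Yield every barcode that differs from `barcode` by one substitution."""
    for i, base in enumerate(barcode):
        for alt in alphabet:
            if alt != base:
                yield barcode[:i] + alt + barcode[i + 1:]


def build_correction_map(barcodes, n_cells: int):
    """Return (frequency table, {observed barcode: corrected barcode})."""
    counts = Counter(barcodes)
    # Assumption: the n_cells most frequent barcodes are treated as real cells.
    whitelist = {cb for cb, _ in counts.most_common(n_cells)}
    correction = {cb: cb for cb in whitelist}
    for cb in counts:
        if cb in whitelist:
            continue
        candidates = [n for n in hamming_neighbors(cb) if n in whitelist]
        if len(candidates) == 1:  # correct only when the match is unambiguous
            correction[cb] = candidates[0]
    return counts, correction


observed = ["AAAA"] * 5 + ["CCCC"] * 4 + ["AAAT", "GGGG"]
counts, correction = build_correction_map(observed, n_cells=2)
```

In this toy input "AAAT" is assigned to "AAAA", while "GGGG" has no kept barcode within one substitution and is left out of the map.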

For further information regarding the tool and its optional parameters, visit the Alevin and Salmon wikis.