
Alevin (version 1.10.1+galaxy2)

Select a reference transcriptome from your history or use a built-in index?:

Built-ins were indexed using default options

Salmon indices

Salmon index 0

Single or paired-end reads?:

Mate pair 1:

CB+UMI raw sequence file(s)

Mate pair 2:

Read-sequence file(s)

Specify the strandedness of the reads:

--libtype

Type of single-cell protocol:

When the single-cell protocol supports variable-length cell barcodes, alevin adds nucleotide padding to make the lengths uniform. The padding scheme ensures that no collisions are introduced in the process.
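Why padding has to be chosen carefully can be shown with a small sketch. The suffix scheme below (an "A" followed by as many "C"s as needed) is an assumption made for illustration and is not confirmed to be alevin's exact scheme; `pad_barcode` and `unpad_barcode` are hypothetical helper names.

```python
def pad_barcode(barcode: str, target_len: int) -> str:
    """Pad a barcode to target_len with a suffix that encodes how much was added.

    Hypothetical scheme: "A" followed by (missing - 1) "C"s. Every barcode must
    receive at least one padding base, so target_len has to exceed the longest
    raw barcode length.
    """
    missing = target_len - len(barcode)
    if missing < 1:
        raise ValueError("target_len must exceed the longest barcode length")
    return barcode + "A" + "C" * (missing - 1)


def unpad_barcode(padded: str) -> str:
    """Invert pad_barcode: drop the trailing "C"s, then the "A" in front of them."""
    return padded.rstrip("C")[:-1]


short, long_ = "ACGTACGTA", "ACGTACGTAA"  # 9 bp and 10 bp barcodes

# Naive padding with a single repeated base maps both barcodes to one string.
naive = {short.ljust(11, "A"), long_.ljust(11, "A")}

# The length-encoding suffix keeps them distinct.
safe = {pad_barcode(short, 11), pad_barcode(long_, 11)}

print(len(naive), len(safe))
```

Because the position of the final "A" differs for every padding length, two barcodes of different lengths can never pad to the same string under this sketch.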

Transcript to gene map file:

TSV file with no header, containing two columns that map each transcript in the reference to its gene (the first column is the transcript and the second is the corresponding gene).
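One common way to produce such a file is to derive it from the GTF annotation that matches the transcriptome. The following is a minimal sketch under that assumption; the function names are hypothetical, and it assumes GTF attributes of the form `key "value";` with `transcript_id` and `gene_id` present.

```python
import csv
import re

# Matches GTF attribute pairs such as: gene_id "G1";
_ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def transcript_gene_pairs(gtf_lines):
    """Yield (transcript_id, gene_id) once per transcript, in first-seen order."""
    seen = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(_ATTRIBUTE.findall(fields[8]))
        transcript, gene = attrs.get("transcript_id"), attrs.get("gene_id")
        if transcript and gene and transcript not in seen:
            seen.add(transcript)
            yield transcript, gene


def write_tx2gene(pairs, handle):
    """Write the pairs as a header-less, two-column, tab-separated file."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)
```

The transcript names in the first column need to match the sequence names in the reference FASTA exactly (including any version suffix), so it is worth checking a few entries against the transcriptome before running the tool.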

Extra output files:

Advanced options

Advanced options 0

Salmon is a lightweight method for quantifying transcript abundance from RNA-seq reads, combining a dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure.

The salmon package contains 4 tools:

Index: creates a salmon index

Quant: quantifies a sample (from raw reads or from existing alignments)

Alevin: Single-cell analysis

Quantmerge: Merges multiple quantifications into a single file

Galaxy divides these four into three separate tools in the IUC toolshed:

Salmon quant

Salmon quantmerge

Alevin

Alevin is a tool, integrated with the salmon software, that introduces a family of algorithms for quantification and analysis of 3’ tagged-end single-cell sequencing data. Currently alevin supports the following two major droplet-based single-cell protocols:

Drop-seq

10x-Chromium v1/2/3

Alevin uses the same indexing scheme as salmon for the reference, and consumes a set of FASTA/Q file(s) containing the Cellular Barcode (CB) + Unique Molecular Identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools).
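Outside Galaxy, the same inputs map onto the `salmon alevin` command line. The sketch below only assembles the argument list; the file and directory names are placeholders, `alevin_command` is a hypothetical helper, and the default values for library type and thread count are assumptions chosen for illustration.

```python
import shlex

# Protocol switches accepted by `salmon alevin` for the protocols listed above.
PROTOCOL_FLAGS = {
    "dropseq": "--dropseq",
    "chromium": "--chromium",      # 10x Chromium v2 chemistry
    "chromiumV3": "--chromiumV3",  # 10x Chromium v3 chemistry
}


def alevin_command(index_dir, cb_umi_fastq, read_fastq, tgmap, out_dir,
                   protocol="chromium", libtype="ISR", threads=4):
    """Build the argument list for a `salmon alevin` run (nothing is executed)."""
    return [
        "salmon", "alevin",
        "-l", libtype,         # library type (--libtype)
        "-1", cb_umi_fastq,    # mate 1: CB+UMI raw sequence file
        "-2", read_fastq,      # mate 2: read-sequence file
        PROTOCOL_FLAGS[protocol],
        "-i", index_dir,       # salmon index directory
        "-p", str(threads),
        "-o", out_dir,
        "--tgMap", tgmap,      # transcript-to-gene map (two-column TSV)
    ]


cmd = alevin_command("salmon_index", "cb_umi.fastq.gz", "reads.fastq.gz",
                     "txp2gene.tsv", "alevin_output")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where salmon is installed.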

Alevin works in two phases. In the first phase, it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all observed CBs, and creates a lightweight data structure for fast lookup and correction of the CBs. In the second phase, alevin uses the read sequences in the other file to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and perform hybrid de-duplication while accounting for UMI collisions. Finally, a CB whitelisting procedure is run after abundance estimation, and a cell-by-gene count matrix is generated.
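The general shape of the two phases can be illustrated with a deliberately simplified sketch. It is not alevin's implementation: the whitelist here is just the `n_cells` most frequent barcodes, correction only considers single substitutions, and de-duplication counts distinct UMIs per cell and gene instead of alevin's hybrid approach. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_correction_table(barcodes, n_cells):
    """Phase 1 (simplified): count CBs and map 1-substitution neighbours to them."""
    freqs = Counter(barcodes)
    whitelist = [cb for cb, _ in freqs.most_common(n_cells)]
    whitelist_set = set(whitelist)
    table = {cb: cb for cb in whitelist}
    ambiguous = set()
    for cb in whitelist:
        for i, base in enumerate(cb):
            for alt in "ACGT":
                if alt == base:
                    continue
                neighbour = cb[:i] + alt + cb[i + 1:]
                if neighbour in whitelist_set:
                    continue  # a whitelisted barcode is never rewritten
                if neighbour in table and table[neighbour] != cb:
                    ambiguous.add(neighbour)  # close to two whitelisted CBs
                else:
                    table[neighbour] = cb
    for neighbour in ambiguous:
        del table[neighbour]  # drop barcodes that cannot be assigned uniquely
    return table


def count_matrix(records, table):
    """Phase 2 (simplified): count distinct UMIs per (cell, gene).

    `records` holds (cell_barcode, umi, gene) tuples, i.e. reads that are
    assumed to have been mapped to a gene already.
    """
    umis = defaultdict(set)
    for cb, umi, gene in records:
        cell = table.get(cb)
        if cell is None:
            continue  # barcode could not be corrected to a whitelisted cell
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}
```

For example, with a table built from barcodes dominated by `AAAA`, a read tagged `AAAT` is credited to the `AAAA` cell, while a barcode far from every whitelisted one is discarded.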

For further information regarding the tool and its optional parameters, visit the Alevin and Salmon wikis.