
GEMINI stats (version 0.20.1)
Only files with version 0.20.1 are accepted.
If you select All variants, the genotype counts are produced using --summarize with the wildcard query "select * from variants".

What it does

The stats tool computes one of the following useful variant statistics for a GEMINI database:

Genotype counts tabulated by sample:

This mode uses the gemini stats --summarize option to produce a table with one row per sample, tabulating the number of sites at which that sample shows a given genotype.

You can choose to calculate the table based on all variants in your database, or to filter the variants before the calculation using GEMINI genotype filter expressions and/or WHERE clauses of GEMINI queries.
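On the command line, the unfiltered case corresponds to the wildcard query "select * from variants", and a filtered case appends a WHERE clause to it. The following is a minimal Python sketch of how such a call could be assembled. The helper name summarize_command, the database name my.db, and the in_dbsnp column used in the example clause are illustrative assumptions; the Galaxy wrapper may build its command differently, and genotype filter expressions are not covered here.

```python
import shlex

def summarize_command(db_path, where=None):
    """Build an argument list for `gemini stats --summarize` (hypothetical helper)."""
    # Start from the wildcard query that covers all variants.
    query = "select * from variants"
    if where:
        # Restrict the variants with a WHERE clause of a GEMINI query.
        query += " where " + where
    return ["gemini", "stats", "--summarize", query, db_path]

# "my.db" and the in_dbsnp clause are placeholders for illustration only.
cmd = summarize_command("my.db", where="in_dbsnp = 1")
print(shlex.join(cmd))
```

Building an argument list rather than a single string avoids shell quoting problems when the query contains spaces or quotes.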

Counts of SNPs by nucleotide change:

This runs gemini stats with the --snp-counts option. The result is a simple table listing the number of occurrences of each observed REF->ALT change in your database, e.g.:

type    count
A->G    2
C->T    1
G->A    1
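To make the tabulation concrete, here is a small Python sketch that counts REF->ALT changes from a list of (ref, alt) pairs. It only illustrates the idea and is not GEMINI's implementation; the function name snp_change_counts and the toy input are assumptions chosen to match the example table.

```python
from collections import Counter

def snp_change_counts(snps):
    """Count REF->ALT changes from an iterable of (ref, alt) pairs."""
    counts = Counter()
    for ref, alt in snps:
        # Keep only single-base substitutions; skip indels and no-change records.
        if len(ref) == 1 and len(alt) == 1 and ref != alt:
            counts[f"{ref}->{alt}"] += 1
    return counts

# Toy data (hypothetical): four SNPs, one indel that is ignored.
snps = [("A", "G"), ("C", "T"), ("A", "G"), ("G", "A"), ("AT", "A")]
counts = snp_change_counts(snps)

print("type\tcount")
for change in sorted(counts):
    print(f"{change}\t{counts[change]}")
```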

Transition / transversion statistics:

This mode uses gemini stats with the --tstv, --tstv-coding, or --tstv-noncoding option to compute the transition/transversion ratio for all SNPs, for SNPs in coding regions, or for SNPs in non-coding regions, respectively.

The result is presented in a 1x3 table listing the number of transitions (ts column), transversions (tv column) and the ratio of the two (ts/tv column), e.g.:

ts    tv    ts/tv
126   39    3.2307
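For reference, a transition is a purine-to-purine (A<->G) or pyrimidine-to-pyrimidine (C<->T) change, and every other single-base substitution is a transversion. The Python sketch below classifies SNPs on that basis and computes the ratio. The function name ts_tv and the synthetic input are assumptions for illustration; how GEMINI formats the printed ratio is not reproduced here.

```python
# The four possible transitions; all other single-base changes are transversions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def ts_tv(snps):
    """Return (ts, tv, ts/tv) for an iterable of (ref, alt) single-base SNPs."""
    ts = tv = 0
    for ref, alt in snps:
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    # Avoid division by zero when no transversions are present.
    ratio = ts / tv if tv else float("nan")
    return ts, tv, ratio

# Synthetic input (hypothetical) with the same totals as the example table.
snps = [("A", "G")] * 126 + [("A", "C")] * 39
ts, tv, ratio = ts_tv(snps)
print(ts, tv, ratio)
```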

Alternate allele frequency spectrum:

Runs gemini stats --sfs to produce binned alternate allele frequency counts in a table like:

aaf     count
0.125   2
0.375   1
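As a rough illustration of where values such as 0.125 and 0.375 can come from, the Python sketch below computes an alternate allele frequency per site from genotype counts and tallies how many sites share each frequency. It assumes diploid samples and simply counts distinct frequency values rather than applying any binning; the function name alt_allele_freq and the per-site counts are hypothetical, and GEMINI's own calculation may differ.

```python
from collections import Counter

def alt_allele_freq(num_het, num_hom_alt, num_called):
    """Alternate allele frequency at one site, assuming diploid genotypes."""
    # Each het carries one alternate allele, each hom-alt carries two.
    return (num_het + 2 * num_hom_alt) / (2 * num_called)

# Hypothetical sites as (num_het, num_hom_alt, num_called) with 4 called samples.
sites = [(1, 0, 4), (1, 0, 4), (1, 1, 4)]
spectrum = Counter(alt_allele_freq(*site) for site in sites)

print("aaf\tcount")
for aaf in sorted(spectrum):
    print(f"{aaf}\t{spectrum[aaf]}")
```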

Pairwise genetic distances:

Runs gemini stats --mds and tabulates all pairwise genetic distances for the samples in your database. An example could look like this:

sample1  sample2  distance
M10500   M10500   0.0
M10475   M10478   1.25
M10500   M10475   2.0
M10500   M10478   0.5714