Galaxy | Tool Preview

seqtk_sample (version 1.4+galaxy0)
The seed used for the random number generator. Manually specifying a number here is useful for reproducing the same subsampling in different runs (e.g. for the read 1 and read 2 files of a paired-end dataset).
Use an integer > 1 to select a specific number of reads. Use a decimal (e.g. 0.5) to select a fraction of reads.
Advanced options

What it does

Takes a random subsample of FASTA or FASTQ sequences. The RNG is seedable to allow for reproducible results; the seed defaults to 4.
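To illustrate why a fixed seed matters, here is a minimal Python sketch. It is not seqtk's code and does not use seqtk's RNG; the function name `sample_fraction` and the toy read names are hypothetical. It only shows that running the same seeded sampler over two files with the same number of records, in the same order, keeps the same positions in both.

```python
import random

def sample_fraction(records, fraction, seed=4):
    """Keep each record with probability `fraction`, using a seeded RNG.

    Hypothetical helper for illustration only; not seqtk's implementation.
    """
    rng = random.Random(seed)  # same seed -> same sequence of random draws
    return [rec for rec in records if rng.random() < fraction]

# Toy paired-end data: record i in read1 pairs with record i in read2.
read1 = [f"read{i}/1" for i in range(10)]
read2 = [f"read{i}/2" for i in range(10)]

kept1 = sample_fraction(read1, 0.5, seed=4)
kept2 = sample_fraction(read2, 0.5, seed=4)

# Because both runs draw the same random sequence, the same positions survive.
paired = [a[:-2] == b[:-2] for a, b in zip(kept1, kept2)]
```

With different seeds for the two files, the kept positions would generally differ and the pairing would be lost.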

The subsample size can be a decimal fraction <= 1, where 1 means that 100% of the reads are used. If a number > 1 is provided, that many reads are taken from the dataset.
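The rule above can be sketched as a small Python function. This is an illustration under stated assumptions, not the tool's actual parsing code: the name `interpret_sample_size` is hypothetical, rejecting values <= 0 is an assumption (the text does not say how they are handled), and so is rounding a non-integer value > 1 to the nearest whole read count.

```python
def interpret_sample_size(value):
    """Classify a subsample size as a fraction of reads or a read count.

    Hypothetical helper: values <= 1 are fractions, values > 1 are counts.
    """
    if value <= 0:
        # Assumption: non-positive sizes are treated as invalid here.
        raise ValueError("subsample size must be positive")
    if value > 1:
        # Assumption: non-integer counts are rounded to the nearest integer.
        return ("count", int(round(value)))
    return ("fraction", float(value))

half = interpret_sample_size(0.5)       # a fraction of the reads
everything = interpret_sample_size(1)   # 1 means 100% of the reads
fixed = interpret_sample_size(10000)    # a fixed number of reads
```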

Two-pass sampling mode reads the input file twice: once to build the list of reads to output, and again to write those reads. It is twice as slow but uses much less RAM. It only takes effect when an integer number of reads (not a fraction) is specified as the subsample size.
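A rough Python sketch of the two-pass idea follows. It assumes reservoir sampling over record indices in the first pass, which is one common way to pick a fixed number of items from a stream; the source text does not state which selection method is used, and the names `two_pass_sample` and `open_records` are hypothetical. The point is that only indices are held in memory, never the sequences themselves.

```python
import random

def two_pass_sample(open_records, n, seed=4):
    """Yield n randomly chosen records, reading the input twice.

    `open_records` is a hypothetical zero-argument callable that returns a
    fresh iterator over the records each time it is called.
    """
    rng = random.Random(seed)

    # Pass 1: reservoir-sample record indices; record contents are discarded.
    chosen = []
    for i, _ in enumerate(open_records()):
        if i < n:
            chosen.append(i)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                chosen[j] = i
    keep = set(chosen)

    # Pass 2: re-read the input and emit only the selected records, in order.
    for i, rec in enumerate(open_records()):
        if i in keep:
            yield rec

records = [f"read{i}" for i in range(100)]
subset = list(two_pass_sample(lambda: iter(records), 5))
```

A single-pass sampler would instead have to hold the selected records themselves in memory, which is where the RAM saving of the two-pass approach comes from.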

Attribution

This Galaxy tool relies on the seqtk toolkit (lh3/seqtk), developed by Heng Li at the Broad Institute.