Denoise and dereplicate single-end sequences
This method denoises single-end sequences, dereplicates them, and filters
chimeras.
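As a minimal illustration of what dereplication means here, the sketch below collapses identical reads into unique sequences with abundance counts. It is plain Python written for this explanation, not the plugin's implementation, and `dereplicate` is a hypothetical helper name.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads into (sequence, abundance) pairs.

    Hypothetical helper for illustration only; pairs are returned
    with the most abundant sequence first.
    """
    counts = Counter(reads)
    return counts.most_common()


# Six reads, three distinct sequences.
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "TTGA", "GGCC"]
unique = dereplicate(reads)
```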
Parameters
- demultiplexed_seqs : SampleData[SequencesWithQuality | PairedEndSequencesWithQuality]
- The single-end demultiplexed sequences to be denoised.
- trunc_len : Int
- Position at which sequences should be truncated due to decrease in
quality. This truncates the 3' end of the input sequences, which
will be the bases that were sequenced in the last cycles. Reads that
are shorter than this value will be discarded. If 0 is provided, no
truncation or length filtering will be performed.
- trim_left : Int, optional
- Position at which sequences should be trimmed due to low quality. This
trims the 5' end of the input sequences, which will be the bases
that were sequenced in the first cycles.
- max_ee : Float, optional
- Reads with a number of expected errors higher than this value will be
discarded.
- trunc_q : Int, optional
- Reads are truncated at the first instance of a quality score less than
or equal to this value. If the resulting read is then shorter than
trunc_len, it is discarded.
- chimera_method : Str % Choices('consensus', 'pooled', 'none'), optional
- The method used to remove chimeras. "none": No chimera removal is
performed. "pooled": All reads are pooled prior to chimera detection.
"consensus": Chimeras are detected in samples individually, and
sequences found chimeric in a sufficient fraction of samples are
removed.
- min_fold_parent_over_abundance : Float, optional
- The minimum abundance of potential parents of a sequence being tested
as chimeric, expressed as a fold-change versus the abundance of the
sequence being tested. Values should be greater than or equal to 1
(i.e. parents should be at least as abundant as the sequence being tested).
This parameter has no effect if chimera_method is "none".
- n_reads_learn : Int, optional
- The number of reads to use when training the error model. Smaller
numbers will result in a shorter run time but a less reliable error
model.
- hashed_feature_ids : Bool, optional
- If true, the feature ids in the resulting table will be presented as
hashes of the sequences defining each feature. The hash will always be
the same for the same sequence so this allows feature tables to be
merged across runs of this method. You should only merge tables if the
exact same parameters are used for each run.
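The per-read filtering parameters above (trunc_q, trunc_len, trim_left, max_ee) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the plugin's code: the order of the steps, and the choice to treat trunc_len and trim_left as positions in the original read, are assumptions the descriptions above do not spell out. The expected-error count uses the standard Phred relation, where a quality score Q corresponds to an error probability of 10^(-Q/10). `expected_errors` and `filter_read` are hypothetical helper names.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities implied by Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def filter_read(seq, quals, *, trunc_len, trim_left, max_ee, trunc_q):
    """Return the filtered (seq, quals) pair, or None if the read is discarded.

    Assumed order (not confirmed by the parameter descriptions):
    trunc_q truncation, trunc_len check and truncation, trim_left, max_ee.
    """
    # Truncate at the first base whose quality is <= trunc_q.
    for i, q in enumerate(quals):
        if q <= trunc_q:
            seq, quals = seq[:i], quals[:i]
            break

    # trunc_len == 0 disables truncation and length filtering.
    if trunc_len > 0:
        if len(seq) < trunc_len:
            return None
        seq, quals = seq[:trunc_len], quals[:trunc_len]

    # Remove the first trim_left bases from the 5' end.
    seq, quals = seq[trim_left:], quals[trim_left:]

    # Discard reads whose expected errors exceed max_ee.
    if expected_errors(quals) > max_ee:
        return None
    return seq, quals


good = filter_read("ACGTACGT", [30] * 8,
                   trunc_len=6, trim_left=2, max_ee=1.0, trunc_q=2)
short = filter_read("ACGTACGT", [30, 30, 30, 30, 2, 30, 30, 30],
                    trunc_len=6, trim_left=0, max_ee=1.0, trunc_q=2)
noisy = filter_read("ACGTACGT", [10] * 8,
                    trunc_len=0, trim_left=0, max_ee=0.5, trunc_q=2)
```

In this sketch the second read is dropped because truncation at the low-quality base leaves it shorter than trunc_len, and the third is dropped because eight bases at Q10 carry more expected errors than the max_ee threshold allows.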
Returns
- table : FeatureTable[Frequency]
- The resulting feature table.
- representative_sequences : FeatureData[Sequence]
- The resulting feature sequences. Each feature in the feature table will
be represented by exactly one sequence.
- denoising_stats : SampleData[DADA2Stats]
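To show why hashed feature ids make tables from separate runs mergeable, the sketch below derives an id from each sequence and sums counts for ids that appear in both runs. The choice of MD5 is an assumption made for illustration (the description above only says "hashes"), and `feature_id` plus the plain-dict tables are hypothetical stand-ins for the real artifacts.

```python
import hashlib


def feature_id(sequence: str) -> str:
    """Deterministic id for a sequence (MD5 assumed for illustration)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


# Toy per-run tables mapping feature id -> total frequency.
run_a = {feature_id("ACGT"): 10, feature_id("TTGA"): 3}
run_b = {feature_id("ACGT"): 5}

# The same sequence hashes to the same id in both runs,
# so counts can be summed per id when merging.
merged = dict(run_a)
for fid, count in run_b.items():
    merged[fid] = merged.get(fid, 0) + count
```

Because the id depends only on the sequence, the merge needs no lookup of the sequences themselves. It is still only meaningful when both runs used the same parameters, as noted for hashed_feature_ids.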