Deblur sequences using a user-specified positive filter.
Perform sequence quality control for Illumina data using the Deblur
workflow, including positive alignment-based filtering. Only forward reads
are supported at this time. This mode of execution is particularly useful
when operating on non-16S data. For example, to apply Deblur to 18S data,
you would want to specify a reference composed of 18S sequences in order to
filter out sequences which do not appear to be 18S. The assessment is
performed by local alignment using SortMeRNA with a permissive e-value
threshold.
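The idea of positive filtering can be illustrated with a small toy sketch. It is not how Deblur or SortMeRNA work internally: the local alignment and e-value threshold are replaced here by a much cruder shared k-mer fraction, and the names `positive_filter`, `k`, and `min_shared` are hypothetical, chosen only for this illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def positive_filter(reads, references, k=4, min_shared=0.5):
    """Keep reads that look like the reference (toy stand-in for alignment).

    ASSUMPTION: a shared k-mer fraction is used instead of SortMeRNA local
    alignment; k and min_shared are illustrative values, not Deblur settings.
    """
    # Pool the k-mers of every reference sequence.
    ref_kmers = set()
    for ref in references:
        ref_kmers |= kmers(ref, k)

    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            # Read is shorter than k, so it cannot be assessed; drop it.
            continue
        shared = len(read_kmers & ref_kmers) / len(read_kmers)
        if shared >= min_shared:
            kept.append(read)
    return kept


references = ["ACGTACGTTGCA"]       # stands in for an 18S reference set
reads = ["ACGTACGT", "GGGGGGGG"]    # the second read resembles nothing in it
kept = positive_filter(reads, references)
print(kept)
```

Only reads resembling the reference survive, which is the behavior the positive filter is meant to provide for non-16S data.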
Parameters
- demultiplexed_seqs : SampleData[SequencesWithQuality | PairedEndSequencesWithQuality | JoinedSequencesWithQuality]
- The demultiplexed sequences to be denoised.
- reference_seqs : FeatureData[Sequence]
- Positive filtering database. Sequences that align to these reference
sequences are kept.
- trim_length : Int
- Sequence trim length; specify -1 to disable trimming.
- left_trim_len : Int % Range(0, None), optional
- Sequence trimming from the 5' end. A value of 0 will disable this trim.
- sample_stats : Bool, optional
- If true, gather stats per sample.
- mean_error : Float, optional
- The mean per-nucleotide error, used to estimate the original sequence.
- indel_prob : Float, optional
- Insertion/deletion (indel) probability (the same value is used for any
number of indels).
- indel_max : Int, optional
- Maximum number of insertions/deletions.
- min_reads : Int, optional
- Retain only features appearing at least min_reads times across all
samples in the resulting feature table.
- min_size : Int, optional
- In each sample, discard all features with an abundance less than
min_size.
- hashed_feature_ids : Bool, optional
- If true, hash the feature IDs.
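The difference between min_size and min_reads is easiest to see on a toy table. The sketch below follows the parameter descriptions above and is not Deblur's own code: the function name `filter_features` and the nested-dict table layout are hypothetical, and applying the per-sample min_size filter before the table-wide min_reads filter is an assumption.

```python
from collections import defaultdict


def filter_features(counts, min_size, min_reads):
    """Apply per-sample and table-wide abundance filters to a toy table.

    counts maps sample ID -> {feature ID: abundance}.
    ASSUMPTION: min_size is applied first, then min_reads.
    """
    # min_size: within each sample, discard features below min_size.
    per_sample = {
        sample: {feat: n for feat, n in feats.items() if n >= min_size}
        for sample, feats in counts.items()
    }

    # min_reads: total each remaining feature across all samples.
    totals = defaultdict(int)
    for feats in per_sample.values():
        for feat, n in feats.items():
            totals[feat] += n
    keep = {feat for feat, total in totals.items() if total >= min_reads}

    # Retain only features that reach min_reads in the whole table.
    return {
        sample: {feat: n for feat, n in feats.items() if feat in keep}
        for sample, feats in per_sample.items()
    }


toy = {
    "s1": {"A": 8, "B": 1, "C": 3},
    "s2": {"A": 5, "B": 9, "C": 2},
}
filtered = filter_features(toy, min_size=2, min_reads=10)
print(filtered)
```

In this toy table, feature B is removed from s1 by min_size, and features B and C are then removed everywhere by min_reads because their totals across samples (9 and 5) fall below 10.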
Returns
- table : FeatureTable[Frequency]
- The resulting denoised feature table.
- representative_sequences : FeatureData[Sequence]
- The resulting feature sequences.
- stats : DeblurStats
- Per-sample stats if requested.