qiime quality-control exclude-seqs (version 2019.4)

Exclude sequences by alignment

This method aligns feature sequences against a set of reference sequences and separates the sequences that hit the reference from those that miss it, according to the specified perc_identity, evalue, and perc_query_aligned thresholds. It can serve as a positive filter, e.g., to extract only feature sequences that align to a certain clade of bacteria, or as a negative filter, e.g., to identify sequences that align to contaminant or human DNA sequences and should be excluded from subsequent analyses. Filtering is based on the perc_identity and perc_query_aligned thresholds, plus the evalue threshold only if method==BLAST and an evalue is set. Set perc_identity==0 and/or perc_query_aligned==0 to disable the corresponding threshold as necessary.
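
The threshold logic described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's implementation: the Alignment record and the is_hit function are hypothetical names, and the sketch assumes that the identity and query-coverage values are on the same scale as the perc_identity and perc_query_aligned thresholds. The text mentions the evalue check only for BLAST, so the sketch applies it for method 'blast' alone; how 'blastn-short' is treated is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    """Hypothetical summary of one query-to-reference alignment."""
    identity: float       # identity to the reference, same scale as perc_identity
    query_aligned: float  # share of the query aligned, same scale as perc_query_aligned
    evalue: float         # expectation value reported by the aligner


def is_hit(aln: Alignment, method: str, perc_identity: float,
           perc_query_aligned: float, evalue: Optional[float] = None) -> bool:
    """Return True if the alignment satisfies every active threshold."""
    # A threshold of 0 can never reject anything, which is how it is "disabled".
    if aln.identity < perc_identity:
        return False
    if aln.query_aligned < perc_query_aligned:
        return False
    # Assumption: the E-value check applies only to 'blast', and only when set.
    if method == 'blast' and evalue is not None and aln.evalue > evalue:
        return False
    return True
```

With both thresholds set to 0 and no evalue, every alignment is accepted as a hit, which matches the note above about disabling the thresholds.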

Parameters

query_sequences : FeatureData[Sequence]
Sequences to test for exclusion
reference_sequences : FeatureData[Sequence]
Reference sequences against which the feature sequences are aligned
method : Str % Choices('blast', 'vsearch', 'blastn-short'), optional
Alignment method to use for matching feature sequences against reference sequences
perc_identity : Float % Range(0.0, 1.0, inclusive_end=True), optional
Reject match if percent identity to reference is lower than this threshold. Must be in range [0.0, 1.0].
evalue : Float, optional
BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.
perc_query_aligned : Float, optional
Percent of query sequence that must align to reference in order to be accepted as a hit.
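
As a rough illustration of the type constraints listed above, the following sketch checks values before they are passed on. The check_params helper and the METHODS constant are hypothetical names introduced for this example and are not part of the plugin. Only the constraints stated in the parameter list are enforced: the method choices and the inclusive [0.0, 1.0] range of perc_identity. No range is declared for perc_query_aligned or evalue, so none is assumed here.

```python
from typing import Optional

# Choices declared for the method parameter.
METHODS = ('blast', 'vsearch', 'blastn-short')


def check_params(method: str, perc_identity: float,
                 perc_query_aligned: float,
                 evalue: Optional[float] = None) -> None:
    """Raise ValueError if a value violates a declared constraint."""
    if method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}, got {method!r}")
    # Range(0.0, 1.0, inclusive_end=True): both endpoints are allowed.
    if not 0.0 <= perc_identity <= 1.0:
        raise ValueError("perc_identity must be in range [0.0, 1.0]")
    # perc_query_aligned and evalue are plain Floats with no declared range,
    # so only their type is checked (evalue may be None, i.e. disabled).
    if not isinstance(perc_query_aligned, (int, float)):
        raise ValueError("perc_query_aligned must be a number")
    if evalue is not None and not isinstance(evalue, (int, float)):
        raise ValueError("evalue must be a number or None")
```

For example, check_params('blast', 1.0, 0.97) passes silently, while check_params('blast', 1.5, 0.97) raises a ValueError.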

Returns

sequence_hits : FeatureData[Sequence]
Subset of feature sequences that align to reference sequences
sequence_misses : FeatureData[Sequence]
Subset of feature sequences that do not align to reference sequences
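
The relationship between the two outputs can be pictured with a small sketch. It assumes the query sequences are held in a plain dict keyed by feature ID and that the IDs of the sequences with an accepted alignment are already known; split_hits_and_misses is a hypothetical helper, not a function of the plugin.

```python
from typing import Dict, Set, Tuple


def split_hits_and_misses(
    query_seqs: Dict[str, str], hit_ids: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Partition query sequences into (sequence_hits, sequence_misses)."""
    # Sequences with an accepted alignment to the reference.
    hits = {fid: seq for fid, seq in query_seqs.items() if fid in hit_ids}
    # Everything else: no alignment passed the thresholds.
    misses = {fid: seq for fid, seq in query_seqs.items() if fid not in hit_ids}
    return hits, misses


queries = {'f1': 'ACGT', 'f2': 'GGCC', 'f3': 'TTAA'}
sequence_hits, sequence_misses = split_hits_and_misses(queries, {'f2'})
```

In this sketch every query sequence lands in exactly one of the two outputs. For a negative filter (e.g., contaminant removal), sequence_misses would be carried forward; for a positive filter, sequence_hits.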