Filter FASTQ by ID (version 0.0.4)
This tool is now obsolete and should not be used in future. It has been replaced by a more general version that covers FASTA, FASTQ and SFF in a single tool.

What it does

By default it divides a FASTQ file in two: those sequences whose ID is present in the specified column(s) of the tabular file, and those whose ID is not. You can opt to have a single output file of just the matching records, or just the non-matching ones.

Note that the order of sequences in the original FASTQ file is preserved. Also, if any sequences share an identifier, duplicates are not removed.
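
The behaviour described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation. The function name split_fastq_by_id is hypothetical, and the sketch assumes simple four-line FASTQ records (no line wrapping) where the read ID is the text after "@" up to the first whitespace.

```python
from typing import Iterable, List, Set, Tuple

Record = List[str]  # the four lines of one FASTQ record, newlines stripped


def split_fastq_by_id(lines: Iterable[str],
                      wanted_ids: Set[str]) -> Tuple[List[Record], List[Record]]:
    """Split FASTQ records into (matched, unmatched) by read ID.

    Hypothetical helper for illustration only. Assumes four lines per
    record. Input order is preserved and duplicate IDs are kept.
    """
    rows = [line.rstrip("\n") for line in lines]
    matched: List[Record] = []
    unmatched: List[Record] = []
    # Step through the input four lines at a time; trailing partial
    # records (for example a final blank line) are ignored.
    for i in range(0, len(rows) - 3, 4):
        record = rows[i:i + 4]
        # Read ID: header without the leading "@", up to the first whitespace.
        read_id = record[0][1:].split(None, 1)[0]
        if read_id in wanted_ids:
            matched.append(record)
        else:
            unmatched.append(record)
    return matched, unmatched
```

Because each record is appended to one of the two lists in the order it is read, both outputs keep the original file order, and a record whose ID appears twice in the input appears twice in the output.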

Example Usage

You may have performed some kind of contamination search, for example running BLASTN against a database of cloning vectors or bacteria, giving you a tabular file containing read identifiers. You could use this tool to extract only the reads without BLAST matches (i.e. those which do not match your contaminant database).
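
As a rough sketch of the first half of this workflow, the read identifiers could be collected from the tabular file as shown below. The function name ids_from_tabular is hypothetical, and the sketch assumes a tab-separated file with the read ID in the first column (zero-based column 0) and treats lines starting with "#" as comments. Both are assumptions about the input rather than documented behaviour of this tool.

```python
from typing import Iterable, Sequence, Set


def ids_from_tabular(lines: Iterable[str],
                     columns: Sequence[int] = (0,)) -> Set[str]:
    """Collect identifiers from the given zero-based columns of a tabular file.

    Hypothetical helper for illustration only. Blank lines and lines
    starting with "#" are skipped (an assumption, not tool behaviour).
    """
    ids: Set[str] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        for col in columns:
            ids.add(fields[col].strip())
    return ids


# Made-up example rows in a BLAST-like tabular layout: read ID, subject, identity.
blast_rows = [
    "r1\tvecA\t99.0\n",
    "#comment\n",
    "r3\tvecB\t97.5\n",
    "r1\tvecC\t95.0\n",
]
contaminant_ids = ids_from_tabular(blast_rows)

# Keep only the reads without a BLAST match, preserving their order.
all_reads = ["r1", "r2", "r3", "r4"]
clean_reads = [r for r in all_reads if r not in contaminant_ids]
```

Using a set means a read with several BLAST hits is listed once, and membership tests stay fast even for large hit tables.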