Divide a FASTA, FASTQ or SFF file in two: those sequences with an ID present in the specified column(s) of a tabular file, and those without. Example uses include filtering based on search results from a tool like NCBI BLAST, TMHMM, SignalP, or a read mapper, e.g. splitting your sequences according to whether or not they have a BLAST match, transmembrane domain, signal peptide, or map to the reference sequence. This tool is a short Python script (using Biopython and Galaxy library functions). It requires Biopython to be installed. Note this tool replaces my three previously separate tools for FASTA, FASTQ and SFF filtering by ID.
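To make the splitting idea concrete, here is a minimal standard-library sketch for the FASTA case only. It is not the tool's actual code (which uses Biopython and also handles FASTQ and SFF); the function names `read_ids`, `iter_fasta` and `split_fasta`, and the zero-based column numbering, are assumptions made purely for illustration.

```python
import io


def read_ids(tabular_handle, columns=(0,)):
    """Collect IDs from the given zero-based columns of a tab-separated file."""
    ids = set()
    for line in tabular_handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        for col in columns:
            if col < len(fields) and fields[col].strip():
                ids.add(fields[col].strip())
    return ids


def _identifier(header):
    """Return the first word after '>' (empty string if there is none)."""
    parts = header[1:].split(None, 1)
    return parts[0] if parts else ""


def iter_fasta(handle):
    """Yield (identifier, record_text) pairs from a FASTA file handle."""
    header, lines = None, []
    for line in handle:
        if line.startswith(">"):
            if header is not None:
                yield _identifier(header), "".join([header] + lines)
            header, lines = line, []
        elif header is not None:
            lines.append(line)
    if header is not None:
        yield _identifier(header), "".join([header] + lines)


def split_fasta(fasta_handle, ids, with_id_handle, without_id_handle):
    """Write each record to one of two outputs; return (matched, unmatched)."""
    matched = unmatched = 0
    for identifier, record in iter_fasta(fasta_handle):
        if identifier in ids:
            with_id_handle.write(record)
            matched += 1
        else:
            without_id_handle.write(record)
            unmatched += 1
    return matched, unmatched


# Small in-memory demonstration (hypothetical data).
demo_ids = read_ids(io.StringIO("seq1\t99.5\nseq3\t80.0\n"))
demo_hits, demo_misses = io.StringIO(), io.StringIO()
demo_counts = split_fasta(
    io.StringIO(">seq1 desc\nACGT\n>seq2\nGGCC\nTT\n>seq3\nAAAA\n"),
    demo_ids,
    demo_hits,
    demo_misses,
)
```

Holding the IDs in a set keeps each lookup constant-time, so the sequence file is streamed once regardless of how many IDs the tabular file contains.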
hg clone https://toolshed.g2.bx.psu.edu/repos/peterjc/seq_filter_by_id
Name | Description | Version | Minimum Galaxy Version |
---|---|---|---|
| | from a tabular file | 0.2.9 | 16.01 |