comparison tools/fasta_tools/fasta_filter_by_id.py @ 1:5cd569750e85
Migrated tool version 0.0.3 from old tool shed archive to new tool shed repository
author | peterjc |
---|---|
date | Tue, 07 Jun 2011 17:22:48 -0400 |
parents | 2e5f8ad1a096 |
children | 5b552b3005f2 |
0:2e5f8ad1a096 | 1:5cd569750e85 |
---|---|
1 #!/usr/bin/env python | 1 #!/usr/bin/env python |
2 """Filter a FASTA file with IDs from a tabular file, e.g. from BLAST. | 2 """Filter a FASTA file with IDs from a tabular file, e.g. from BLAST. |
3 | 3 |
4 Takes five command line options, tabular filename, ID column numbers | 4 Takes five command line options, tabular filename, ID column numbers |
5 (comma separated list using one based counting), input FASTA filename, and | 5 (comma separated list using one based counting), input FASTA filename, and |
6 two output FASTA filenames (for records with and without any BLAST hits). | 6 two output FASTA filenames (for records with and without the given IDs). |
7 If the either output filename is just a minus sign, that file is not created. | 7 |
 | 8 If either output filename is just a minus sign, that file is not created. |
8 This is intended to allow output for just the matched (or just the non-matched) | 9 This is intended to allow output for just the matched (or just the non-matched) |
9 records. | 10 records. |
10 | 11 |
11 Note in the default NCBI BLAST+ tabular output, the query sequence ID is | 12 Note in the default NCBI BLAST+ tabular output, the query sequence ID is |
12 in column one, and the ID of the match from the database is in column two. | 13 in column one, and the ID of the match from the database is in column two. |
49 if not line.startswith("#"): | 50 if not line.startswith("#"): |
50 ids.add(line.rstrip("\n").split("\t")[col]) | 51 ids.add(line.rstrip("\n").split("\t")[col]) |
51 print "Using %i IDs from tabular file" % (len(ids)) | 52 print "Using %i IDs from tabular file" % (len(ids)) |
52 handle.close() | 53 handle.close() |
53 | 54 |
54 #Write filtered FASTA file based on IDs from BLAST file | 55 #Write filtered FASTA file based on IDs from tabular file |
55 reader = fastaReader(open(in_file, "rU")) | 56 reader = fastaReader(open(in_file, "rU")) |
56 if out_positive_file != "-" and out_negative_file != "-": | 57 if out_positive_file != "-" and out_negative_file != "-": |
57 print "Generating two FASTA files" | 58 print "Generating two FASTA files" |
58 positive_writer = fastaWriter(open(out_positive_file, "w")) | 59 positive_writer = fastaWriter(open(out_positive_file, "w")) |
59 negative_writer = fastaWriter(open(out_negative_file, "w")) | 60 negative_writer = fastaWriter(open(out_negative_file, "w")) |
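
The filtering described in the docstring can be sketched as follows. This is a minimal, self-contained Python 3 illustration, not the tool's own code: the script in the diff is Python 2 and relies on Galaxy's `fastaReader` and `fastaWriter`, which are not used here. The helper names `parse_columns`, `read_ids`, `iter_fasta` and `split_by_id` are hypothetical, and the sketch assumes that a record's ID is the first whitespace-delimited word of its FASTA title line.

```python
def parse_columns(text):
    """Turn a one-based, comma-separated column list such as "1,2"
    into zero-based indices (hypothetical helper)."""
    return [int(col) - 1 for col in text.split(",")]


def read_ids(handle, columns):
    """Collect the IDs found in the given zero-based columns of a
    tab-separated file, skipping comment lines that start with '#'."""
    ids = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        for col in columns:
            ids.add(fields[col])
    return ids


def iter_fasta(handle):
    """Yield (title, sequence) pairs from a FASTA handle."""
    title, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line.strip())
    if title is not None:
        yield title, "".join(chunks)


def split_by_id(records, ids):
    """Split records into those whose ID is in `ids` and those whose ID is not.

    Assumption: the ID is the first word of the title line.
    """
    positive, negative = [], []
    for title, seq in records:
        words = title.split(None, 1)
        identifier = words[0] if words else ""
        if identifier in ids:
            positive.append((title, seq))
        else:
            negative.append((title, seq))
    return positive, negative
```

For default NCBI BLAST+ tabular output, `parse_columns("1")` would select the query IDs and `parse_columns("2")` the IDs of the database matches. Writing only one of the two lists returned by `split_by_id` corresponds to passing a minus sign as the other output filename.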