comparison tools/fastq/fastq_filter_by_id.py @ 2:d570cc324779

Migrated tool version 0.0.4 from old tool shed archive to new tool shed repository
author peterjc
date Tue, 07 Jun 2011 17:24:08 -0400
parents 10e963c79a45
children
1:b79caa511ba2 2:d570cc324779
1 #!/usr/bin/env python 1 #!/usr/bin/env python
2 """Filter a FASTQ file with IDs from a tabular file, e.g. from BLAST. 2 """Filter a FASTQ file with IDs from a tabular file, e.g. from BLAST.
3
4 NOTE - This script is now OBSOLETE, having been replaced by a new version
5 which handles FASTA, FASTQ and SFF all in one.
3 6
4 Takes five command line options, tabular filename, ID column numbers 7 Takes five command line options, tabular filename, ID column numbers
5 (comma separated list using one based counting), input FASTQ filename, and 8 (comma separated list using one based counting), input FASTQ filename, and
6 two output FASTQ filenames (for records with and without the given IDs). 9 two output FASTQ filenames (for records with and without the given IDs).
7 10
11 14
12 Note in the default NCBI BLAST+ tabular output, the query sequence ID is 15 Note in the default NCBI BLAST+ tabular output, the query sequence ID is
13 in column one, and the ID of the match from the database is in column two. 16 in column one, and the ID of the match from the database is in column two.
14 Here sensible values for the column numbers would therefore be "1" or "2". 17 Here sensible values for the column numbers would therefore be "1" or "2".
15 18
16 This script is copyright 2010 by Peter Cock, SCRI, UK. All rights reserved. 19 This script is copyright 2010-2011 by Peter Cock, SCRI, UK. All rights reserved.
17 See accompanying text file for licence details (MIT/BSD style). 20 See accompanying text file for licence details (MIT/BSD style).
18 21
19 This is version 0.0.2 of the script. 22 This is version 0.0.4 of the script.
20 """ 23 """
21 import sys 24 import sys
22 from galaxy_utils.sequence.fastq import fastqReader, fastqWriter 25 from galaxy_utils.sequence.fastq import fastqReader, fastqWriter
23 26
24 def stop_err( msg ): 27 def stop_err( msg ):
84 negative_writer = fastqWriter(open(out_negative_file, "w")) 87 negative_writer = fastqWriter(open(out_negative_file, "w"))
85 for record in reader: 88 for record in reader:
86 #The [1:] is because the fastqReader leaves the @ on the identifier. 89 #The [1:] is because the fastqReader leaves the @ on the identifier.
87 if not record.identifier or record.identifier.split()[0][1:] not in ids: 90 if not record.identifier or record.identifier.split()[0][1:] not in ids:
88 negative_writer.write(record) 91 negative_writer.write(record)
89 positive_writer.close()
90 negative_writer.close() 92 negative_writer.close()
91 else: 93 else:
92 stop_err("Neither output file requested") 94 stop_err("Neither output file requested")
93 reader.close() 95 reader.close()
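
The filtering described in the docstring (collect IDs from chosen columns of a tabular file, then route each FASTQ record to a "with ID" or "without ID" output) can be sketched without Galaxy as shown here. This is a minimal illustration, not the script's actual code: the helper names read_ids, parse_fastq and split_by_id are hypothetical, the parser assumes simple four-line FASTQ records, and columns are zero-based here, whereas the script takes one-based column numbers on the command line.

```python
import io


def read_ids(tabular_handle, columns):
    """Collect IDs from the given zero-based columns of a tab-separated file.

    Hypothetical helper; blank lines and '#' comment lines are skipped.
    """
    ids = set()
    for line in tabular_handle:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        for col in columns:
            ids.add(fields[col].strip())
    return ids


def parse_fastq(handle):
    """Yield (title, sequence, quality) tuples.

    Assumes each record is exactly four lines (no wrapped sequences);
    parsing stops at the first empty title line or end of file.
    """
    while True:
        title = handle.readline().rstrip("\n")
        if not title:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is discarded
        qual = handle.readline().rstrip("\n")
        yield title, seq, qual


def split_by_id(records, ids):
    """Split records into (positive, negative) lists by membership in ids."""
    positive, negative = [], []
    for title, seq, qual in records:
        words = title.split()
        # Drop the leading '@' and keep only the first word, mirroring
        # record.identifier.split()[0][1:] in the script above.
        name = words[0][1:] if words else ""
        target = positive if name in ids else negative
        target.append((title, seq, qual))
    return positive, negative


# Example: keep reads named in column one of a BLAST-style table.
table = io.StringIO("read1\thitA\nread3\thitB\n")
reads = io.StringIO(
    "@read1 desc\nACGT\n+\nIIII\n"
    "@read2\nGGCC\n+\nIIII\n"
    "@read3\nTTAA\n+\nIIII\n"
)
wanted = read_ids(table, [0])
with_id, without_id = split_by_id(parse_fastq(reads), wanted)
```

Holding the IDs in a set keeps each membership test cheap, so the FASTQ file can be processed in a single pass regardless of how many IDs the tabular file lists.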