annotate tools/seq_select_by_id/seq_select_by_id.py @ 8:8e1a90917fa7 draft

v0.0.13 Python 3 compatible exception handling
author peterjc
date Wed, 17 May 2017 09:23:03 -0400
parents a5602454b0ad
children 3b0a14722175
rev   line source
4
6842c0c7bc70 Uploaded v0.0.7, depend on Biopython 1.62, tabs to spaces in XML
peterjc
  4      1 #!/usr/bin/env python
  4      2 """Select FASTA, QUAL, FASTQ or SFF sequences by IDs from a tabular file.
  4      3
  4      4 Takes five command line options, tabular filename, ID column number (using
  4      5 one based counting), input filename, input type (e.g. FASTA or SFF) and the
  4      6 output filename (same format as input sequence file).
  4      7
  4      8 When selecting from an SFF file, any Roche XML manifest in the input file is
  4      9 preserved in the output file.
  4     10
  4     11 This tool is a short Python script which requires Biopython 1.54 or later
  4     12 for SFF file support. If you use this tool in scientific work leading to a
  4     13 publication, please cite the Biopython application note:
  4     14
  4     15 Cock et al 2009. Biopython: freely available Python tools for computational
  4     16 molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
  4     17 http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
  4     18
7
a5602454b0ad v0.0.12 Depends on Biopython 1.67 via legacy Tool Shed package or bioconda; Python 3 compatible print function
peterjc
parents: 6
  7     19 This script is copyright 2011-2017 by Peter Cock, The James Hutton Institute UK.
  4     20 All rights reserved. See accompanying text file for licence details (MIT
  4     21 license).
  4     22 """
  7     23
  7     24 from __future__ import print_function
  7     25
  4     26 import sys
  4     27
  4     28 if "-v" in sys.argv or "--version" in sys.argv:
  8     29     print("v0.0.13")
  4     30     sys.exit(0)
  4     31
  7     32 # Parse Command Line
  4     33 try:
  4     34     tabular_file, col_arg, in_file, seq_format, out_file = sys.argv[1:]
  4     35 except ValueError:
  7     36     sys.exit("Expected five arguments, got %i:\n%s" % (len(sys.argv) - 1, " ".join(sys.argv)))
  4     37 try:
  4     38     if col_arg.startswith("c"):
  7     39         column = int(col_arg[1:]) - 1
  4     40     else:
  7     41         column = int(col_arg) - 1
  4     42 except ValueError:
  7     43     sys.exit("Expected column number, got %s" % col_arg)
  4     44
  4     45 if seq_format == "fastqcssanger":
  7     46     sys.exit("Colorspace FASTQ not supported.")
  4     47 elif seq_format.lower() in ["sff", "fastq", "qual", "fasta"]:
  4     48     seq_format = seq_format.lower()
  4     49 elif seq_format.lower().startswith("fastq"):
  7     50     # We don't care how the qualities are encoded
  4     51     seq_format = "fastq"
  4     52 elif seq_format.lower().startswith("qual"):
  7     53     # We don't care what the scores are
  4     54     seq_format = "qual"
  4     55 else:
  7     56     sys.exit("Unrecognised file format %r" % seq_format)
  4     57
  4     58
  4     59 try:
  4     60     from Bio import SeqIO
  4     61 except ImportError:
  7     62     sys.exit("Biopython 1.54 or later is required")
  4     63
  4     64
  4     65 def parse_ids(tabular_file, col):
6
91f55ee8fea5 v0.0.11; more tests and assorting minor changes
peterjc
parents: 4
  6     66     """Read tabular file and record all specified identifiers.
  6     67
  6     68     Will print a single warning to stderr if any of the fields have
  6     69     non-trailing white space (only the first word will be used as
  6     70     the identifier).
  6     71     """
  4     72     handle = open(tabular_file, "r")
  6     73     warn = False
  4     74     for line in handle:
  4     75         if line.strip() and not line.startswith("#"):
  6     76             field = line.rstrip("\n").split("\t")[col].strip()
  6     77             parts = field.split(None, 1)
  6     78             if len(parts) > 1 and not warn:
  6     79                 warn = "WARNING: Some of your identifiers had white space in them, " + \
  6     80                     "using first word only. e.g.:\n%s\n" % field
  6     81             yield parts[0]
  4     82     handle.close()
  6     83     if warn:
  6     84         sys.stderr.write(warn)
  4     85
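The identifier handling in parse_ids above can be tried in isolation. The following is a minimal sketch only, not part of the script: first_words is a hypothetical helper that applies the same logic (skip blank lines and "#" comment lines, take one tab-separated column, keep only its first word) to an in-memory list of lines instead of opening a file, and it omits the warning.

```python
def first_words(lines, col):
    """Yield the first whitespace-delimited word of column `col` (zero based).

    Hypothetical helper mirroring parse_ids, but reading from any iterable
    of lines rather than opening a file, and without the stderr warning.
    """
    for line in lines:
        # Skip blank lines and comment lines, as parse_ids does
        if line.strip() and not line.startswith("#"):
            field = line.rstrip("\n").split("\t")[col].strip()
            # Only the first word of the field is used as the identifier
            yield field.split(None, 1)[0]


example = [
    "#id\tscore\n",
    "seq1 extra words\t10\n",
    "\n",
    "seq2\t20\n",
]
ids = list(first_words(example, 0))
print(ids)
```

Note that the script converts the one-based column argument (for example "c1" or "1") to a zero-based index before calling parse_ids, which is why the sketch passes 0 for the first column.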
  7     86
  7     87 # Index the sequence file.
  7     88 # If very big, could use SeqIO.index_db() to avoid memory bottleneck...
  4     89 records = SeqIO.index(in_file, seq_format)
  7     90 print("Indexed %i sequences" % len(records))
  4     91
  7     92 if seq_format.lower() == "sff":
  7     93     # Special case to try to preserve the XML manifest
  4     94     try:
  7     95         from Bio.SeqIO.SffIO import SffWriter
  4     96     except ImportError:
  7     97         sys.exit("Requires Biopython 1.54 or later")
  4     98
  4     99     try:
  4    100         from Bio.SeqIO.SffIO import ReadRocheXmlManifest
  4    101     except ImportError:
  7    102         # Prior to Biopython 1.56 this was a private function
  4    103         from Bio.SeqIO.SffIO import _sff_read_roche_index_xml as ReadRocheXmlManifest
  4    104
  7    105     in_handle = open(in_file, "rb")  # must be binary mode!
  4    106     try:
  4    107         manifest = ReadRocheXmlManifest(in_handle)
  4    108     except ValueError:
  4    109         manifest = None
  4    110     in_handle.close()
  4    111
  4    112     out_handle = open(out_file, "wb")
  4    113     writer = SffWriter(out_handle, xml=manifest)
  4    114     count = 0
  7    115     # This does have the overhead of parsing into SeqRecord objects,
  7    116     # but doing the header and index at the low level is too fiddly.
  7    117     name = None  # We want the variable to leak from the iterator's scope...
  4    118     iterator = (records[name] for name in parse_ids(tabular_file, column))
  4    119     try:
  4    120         count = writer.write_file(iterator)
  8    121     except KeyError:
  4    122         out_handle.close()
  4    123         if name not in records:
  7    124             sys.exit("Identifier %r not found in sequence file" % name)
  4    125         else:
  8    126             raise
  4    127     out_handle.close()
  4    128 else:
  7    129     # Avoid overhead of parsing into SeqRecord objects,
  7    130     # just re-use the original formatting from the input file.
  4    131     out_handle = open(out_file, "wb")
  4    132     count = 0
  4    133     for name in parse_ids(tabular_file, column):
  4    134         try:
  4    135             out_handle.write(records.get_raw(name))
  4    136         except KeyError:
  4    137             out_handle.close()
  7    138             sys.exit("Identifier %r not found in sequence file" % name)
  4    139         count += 1
  4    140     out_handle.close()
  4    141
  7    142 print("Selected %i sequences by ID" % count)
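
A caveat on the SFF branch above: a generator expression has its own scope, so the module-level name set on source line 117 is not updated by the expression on line 118. If an identifier is missing, the error message would therefore appear to report None rather than the identifier. The sketch below shows one possible way to track the last requested identifier. It is an illustration under stated assumptions, not the author's code: lookup_records, state and fake_records are hypothetical names, and a plain dict stands in for the SeqIO.index() result.

```python
def lookup_records(records, names, state):
    """Yield records[name] for each name, remembering the last name tried.

    `state` is a plain dict (hypothetical helper argument) updated before
    each lookup, so the caller can report which identifier failed.
    """
    for name in names:
        state["name"] = name
        yield records[name]


# Stand-in for the SeqIO.index() result: any mapping raising KeyError works
fake_records = {"seq1": "ACGT", "seq2": "GGCC"}
state = {"name": None}
found = []
message = None
try:
    for record in lookup_records(fake_records, ["seq1", "missing", "seq2"], state):
        found.append(record)
except KeyError:
    # The failing identifier is available even though the generator raised
    message = "Identifier %r not found in sequence file" % state["name"]

print(found)
print(message)
```

In the script itself, a generator like this could be passed to writer.write_file() in place of the generator expression, with the except clause reading the identifier from state.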