# HG changeset patch
# User peterjc
# Date 1478261468 14400
# Node ID fb1313d79396148034eebfbf20ef70fa23130cb5
# Parent 03e134cae41a7d518e89a70dc50ffb70313ec58a
Uploaded v0.2.5, ignore blank names in tabular files (based on contribution from Gildas Le Corguille)

diff -r 03e134cae41a -r fb1313d79396 tools/seq_filter_by_id/README.rst
--- a/tools/seq_filter_by_id/README.rst Tue May 17 05:59:24 2016 -0400
+++ b/tools/seq_filter_by_id/README.rst Fri Nov 04 08:11:08 2016 -0400
@@ -89,6 +89,8 @@
 v0.2.3  - Ignore blank lines in ID file (contributed by Gildas Le Corguillé).
         - Defensive quoting of filenames etc in the command definition
           (internal change only).
+v0.2.4  - Corrected error message wording.
+v0.2.5  - Ignore empty names, common in R output (Gildas Le Corguillé).
 ======= ======================================================================
 
 
diff -r 03e134cae41a -r fb1313d79396 tools/seq_filter_by_id/seq_filter_by_id.py
--- a/tools/seq_filter_by_id/seq_filter_by_id.py Tue May 17 05:59:24 2016 -0400
+++ b/tools/seq_filter_by_id/seq_filter_by_id.py Fri Nov 04 08:11:08 2016 -0400
@@ -74,7 +74,7 @@
 options, args = parser.parse_args()
 
 if options.version:
-    print "v0.2.3"
+    print "v0.2.5"
     sys.exit(0)
 
 in_file = options.input
@@ -93,7 +93,7 @@
 if logic not in ["UNION", "INTERSECTION"]:
     sys.exit("Logic agrument should be 'UNION' or 'INTERSECTION', not %r" % logic)
 if options.id_list and args:
-    sys.exit("Cannot accepted IDs via both -t and as tabular files")
+    sys.exit("Cannot accept IDs via both -t in the command line, and as tabular files")
 elif not options.id_list and not args:
     sys.exit("Expected matched pairs of tabular files and columns (or -t given)")
 if len(args) % 2:
@@ -181,7 +181,7 @@
     '\r': '__cr__',
     '\t': '__tc__',
     '#': '__pd__',
-    }
+}
 
 # Read tabular file(s) and record all specified identifiers
 ids = None  # Will be a set
@@ -206,15 +206,19 @@
                 continue
             parts = line.rstrip("\n").split("\t")
             for col in columns:
-                file_ids.add(clean_name(parts[col]))
+                name = clean_name(parts[col])
+                if name:
+                    file_ids.add(name)
     else:
         # Single column, special case speed up
         col = columns[0]
         for line in handle:
-            if not line.strip(): #skip empty lines
+            if not line.strip():  # skip empty lines
                 continue
             if not line.startswith("#"):
-                file_ids.add(clean_name(line.rstrip("\n").split("\t")[col]))
+                name = clean_name(line.rstrip("\n").split("\t")[col])
+                if name:
+                    file_ids.add(name)
     print "Using %i IDs from column %s in tabular file" % (len(file_ids), ", ".join(str(col + 1) for col in columns))
     if ids is None:
         ids = file_ids
diff -r 03e134cae41a -r fb1313d79396 tools/seq_filter_by_id/seq_filter_by_id.xml
--- a/tools/seq_filter_by_id/seq_filter_by_id.xml Tue May 17 05:59:24 2016 -0400
+++ b/tools/seq_filter_by_id/seq_filter_by_id.xml Fri Nov 04 08:11:08 2016 -0400
@@ -1,8 +1,7 @@
-
+
 from a tabular file
 biopython
- Bio
diff -r 03e134cae41a -r fb1313d79396 tools/seq_filter_by_id/tool_dependencies.xml
--- a/tools/seq_filter_by_id/tool_dependencies.xml Tue May 17 05:59:24 2016 -0400
+++ b/tools/seq_filter_by_id/tool_dependencies.xml Fri Nov 04 08:11:08 2016 -0400
@@ -1,6 +1,6 @@
-
+