seq_filter_by_id: tools/seq_filter_by_id/seq_filter_by

comparison tools/seq_filter_by_id/seq_filter_by_id.py @ 7:fb1313d79396 draft

Uploaded v0.2.5, ignore blank names in tabular files (based on contribution from Gildas Le Corguille)

author	peterjc
date	Fri, 04 Nov 2016 08:11:08 -0400
parents	03e134cae41a
children	2d4537dbf0bc
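
The behavioural change in this changeset (blank names are no longer added to the ID set) can be illustrated with a minimal Python 3 sketch. It is not the script's own code: `clean_name_stub` and `collect_ids` are hypothetical names, and the stub only trims whitespace, whereas the real `clean_name` is not shown in this comparison and may do more.

```python
import io


def clean_name_stub(name):
    """Hypothetical stand-in for the script's clean_name(); only trims whitespace."""
    return name.strip()


def collect_ids(handle, columns):
    """Collect non-blank identifiers from the given zero-based columns."""
    file_ids = set()
    for line in handle:
        if not line.strip():
            # Skip empty lines
            continue
        if line.startswith("#"):
            # Ignore comments
            continue
        parts = line.rstrip("\n").split("\t")
        for col in columns:
            name = clean_name_stub(parts[col])
            if name:
                # Only record the identifier when it is not blank
                file_ids.add(name)
    return file_ids


# The third line has an empty first column, so it contributes no ID.
example = io.StringIO("#id\tnote\nseq1\tkept\n\tblank id\nseq2\tkept\n")
print(sorted(collect_ids(example, [0])))
```

Before this change, a row with an empty ID column would have added an empty string to the set, which then counted as one of the IDs in use.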

-:03e134cae41a
+:fb1313d79396
 help="Show version and quit")
 options, args = parser.parse_args()
 if options.version:
-print "v0.2.3"
+print "v0.2.5"
 sys.exit(0)
 in_file = options.input
 seq_format = options.format
 out_positive_file = options.output_positive
 if seq_format is None:
 sys.exit("Missing sequence format")
 if logic not in ["UNION", "INTERSECTION"]:
 sys.exit("Logic agrument should be 'UNION' or 'INTERSECTION', not %r" % logic)
 if options.id_list and args:
-sys.exit("Cannot accepted IDs via both -t and as tabular files")
+sys.exit("Cannot accept IDs via both -t in the command line, and as tabular files")
 elif not options.id_list and not args:
 sys.exit("Expected matched pairs of tabular files and columns (or -t given)")
 if len(args) % 2:
 sys.exit("Expected matched pairs of tabular files and columns, not: %r" % args)
 '@': '__at__',
 '\n': '__cn__',
 '\r': '__cr__',
 '\t': '__tc__',
 '#': '__pd__',
 }
 # Read tabular file(s) and record all specified identifiers
 ids = None  # Will be a set
 if options.id_list:
 assert not identifiers
 if line.startswith("#"):
 # Ignore comments
 continue
 parts = line.rstrip("\n").split("\t")
 for col in columns:
-file_ids.add(clean_name(parts[col]))
+name = clean_name(parts[col])
+if name:
+file_ids.add(name)
 else:
 # Single column, special case speed up
 col = columns[0]
 for line in handle:
-if not line.strip(): #skip empty lines
+if not line.strip():  # skip empty lines
 continue
 if not line.startswith("#"):
-file_ids.add(clean_name(line.rstrip("\n").split("\t")[col]))
+name = clean_name(line.rstrip("\n").split("\t")[col])
+if name:
+file_ids.add(name)
 print "Using %i IDs from column %s in tabular file" % (len(file_ids), ", ".join(str(col + 1) for col in columns))
 if ids is None:
 ids = file_ids
 if logic == "UNION":
 ids.update(file_ids)
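
The closing context lines show per-file ID sets being merged according to `logic`. A minimal Python 3 sketch of that merging step follows. Only the `UNION` branch is visible in this comparison, so the `INTERSECTION` branch here (using `intersection_update`) is an assumption, and `combine_ids` is a hypothetical helper name rather than a function in the script.

```python
def combine_ids(id_sets, logic):
    """Merge several sets of identifiers using UNION or INTERSECTION logic."""
    if logic not in ("UNION", "INTERSECTION"):
        raise ValueError("Logic should be 'UNION' or 'INTERSECTION', not %r" % logic)
    ids = None  # Will be a set once the first file has been seen
    for file_ids in id_sets:
        if ids is None:
            # First set seeds the result (copied so the input is not mutated)
            ids = set(file_ids)
        elif logic == "UNION":
            ids.update(file_ids)
        else:
            # Assumed behaviour; this branch is not shown in the comparison
            ids.intersection_update(file_ids)
    return ids if ids is not None else set()


print(sorted(combine_ids([{"a", "b"}, {"b", "c"}], "UNION")))
print(sorted(combine_ids([{"a", "b"}, {"b", "c"}], "INTERSECTION")))
```

With blank names filtered out beforehand, an empty string can no longer appear in either the union or the intersection.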
