annotate list-chrom-cols.py @ 5:fb9c0409d85c draft

planemo upload for repository https://github.com/workflow4metabolomics/lcmsmatching.git commit 608d9e59a0d2dcf85a037968ddb2c61137fb9bce
author prog
date Wed, 19 Apr 2017 10:00:05 -0400
parents 20d69a062da3
children
rev   line source
1
253d531a0193 planemo upload for repository https://github.com/workflow4metabolomics/lcmsmatching.git commit 36c9d8099c20a1ae848f1337c16564335dd8fb2b
prog
parents:
diff changeset
1 #!/usr/bin/env python
5
fb9c0409d85c planemo upload for repository https://github.com/workflow4metabolomics/lcmsmatching.git commit 608d9e59a0d2dcf85a037968ddb2c61137fb9bce
prog
parents: 2
diff changeset
2 # vi: fdm=marker
1
3
4 import argparse
5 import subprocess
6 import re
7 import urllib2
8 import json
9 import csv
10
5
11 # Get chrom cols {{{1
12 ################################################################
13
14 def get_chrom_cols(dbtype, dburl, dbtoken = None, col_field = 'chromcol'):
1
15
16     cols = []
17
18     if dbtype == 'peakforest':
19         url = dburl + ( '' if dburl[-1] == '/' else '/' ) + 'metadata/lc/list-code-columns'
20         if dbtoken is not None:
21             url += '?token=' + dbtoken
22         result = urllib2.urlopen(url).read()
23         v = json.JSONDecoder().decode(result)
24         i = 0
25         for colid, coldesc in v.iteritems():
5
26             s = coldesc['name'] + ' - ' + coldesc['constructor'] + ' - L' + str(coldesc['length']) + ' - diam. ' + str(coldesc['diameter']) + ' - part. ' + str(coldesc['particule_size']) + ' - flow ' + str(coldesc['flow_rate'])
27             cols.append( (s , colid, i == 0) )
1
28             i += 1 # Advance the index so that only the first column is flagged as selected
29
30     elif dbtype == 'inhouse':
31
32         # Get all column names from file
5
33         with open(dburl if isinstance(dburl, str) else dburl.get_file_name(), 'r') as dbfile:
1
34             reader = csv.reader(dbfile, delimiter = "\t", quotechar='"')
35             header = reader.next()
2
20d69a062da3 planemo upload for repository https://github.com/workflow4metabolomics/lcmsmatching.git commit d4048accde6bdfd5b3e14f5394902d38991854f8
prog
parents: 1
diff changeset
36             if col_field in header:
37                 i = header.index(col_field)
38                 allcols = []
39                 for row in reader:
40                     col = row[i]
41                     if col not in allcols:
42                         allcols.append(col)
43                 for i, c in enumerate(allcols):
44                     cols.append( (c, c, i == 0) )
1
45
46     return cols
47
5
48 # Main {{{1
49 ################################################################
1
50
51 if __name__ == '__main__':
52
53     # Parse command line arguments
54     parser = argparse.ArgumentParser(description='Script for getting chromatographic columns of an RMSDB database for Galaxy tool lcmsmatching.')
55     parser.add_argument('-d', help = 'Database type', dest = 'dbtype', required = True)
56     parser.add_argument('-u', help = 'Database URL', dest = 'dburl', required = True)
57     parser.add_argument('-t', help = 'Database token', dest = 'dbtoken', required = False)
5
58     parser.add_argument('-f', help = 'Chromatographic column field name', dest = 'col_field', required = False, default = 'chromcol')
1
59     args = parser.parse_args()
60     args_dict = vars(args)
61
62     print(get_chrom_cols(**args_dict))
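
The script above targets Python 2 (urllib2, iteritems, reader.next()). As a rough illustration only, the two pieces of logic that do not need network access could be sketched in Python 3 as follows. The helper names peakforest_columns_url and inhouse_columns are hypothetical and do not exist in the repository, and the example URL is a placeholder; this is a sketch under those assumptions, not the tool's actual implementation.

```python
import csv
import io


def peakforest_columns_url(dburl, dbtoken=None):
    """Build the column-listing URL in the same way as the 'peakforest' branch.

    Hypothetical helper: the original script builds this URL inline.
    """
    url = dburl + ('' if dburl.endswith('/') else '/') + 'metadata/lc/list-code-columns'
    if dbtoken is not None:
        url += '?token=' + dbtoken
    return url


def inhouse_columns(dbfile, col_field='chromcol'):
    """Return (label, value, selected) tuples for the unique values of col_field.

    Hypothetical helper mirroring the 'inhouse' branch; dbfile is any open
    text file object containing tab-separated data with a header row.
    """
    reader = csv.reader(dbfile, delimiter='\t', quotechar='"')
    header = next(reader, [])  # Python 3 replacement for reader.next()
    if col_field not in header:
        return []
    idx = header.index(col_field)
    seen = []
    for row in reader:
        if row[idx] not in seen:
            seen.append(row[idx])  # keep first-seen order, drop duplicates
    # Only the first entry is flagged as selected.
    return [(c, c, i == 0) for i, c in enumerate(seen)]


if __name__ == '__main__':
    # In-memory stand-in for an in-house database file (made-up sample data).
    sample = io.StringIO('mz\tchromcol\n100.1\tcolA\n200.2\tcolB\n300.3\tcolA\n')
    print(inhouse_columns(sample))
    # Placeholder URL and token, for illustration only.
    print(peakforest_columns_url('https://example.org/rest', 'abc'))
```

Using enumerate to derive the "selected" flag avoids a hand-maintained counter, and endswith also copes with an empty base URL, which dburl[-1] would not.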