comparison toolfactory/test-data/input1_sample @ 50:807b8d103f18 draft default tip

Uploaded
author fubar
date Wed, 09 Jun 2021 02:28:01 +0000
parents 11a89a9f2f92
children
comparison
49:36d46b20095a 50:807b8d103f18
1 *WARNING before you start* 1 # see https://github.com/fubar2/toolfactory
2 2 #
3 Install this tool on a private Galaxy ONLY 3 # copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012
4 Please NEVER on a public or production instance 4 #
5 5 # all rights reserved
6 Updated august 2014 by John Chilton adding citation support 6 # Licensed under the LGPL
7 7 # suggestions for improvement and bug fixes welcome at
8 Updated august 8 2014 to fix bugs reported by Marius van den Beek 8 # https://github.com/fubar2/toolfactory
9 9 #
10 Please cite the resource at 10 # April 2021: Refactored into two tools - generate and test/install
11 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref 11 # as part of GTN tutorial development and biocontainer adoption
12 if you use this tool in your published work. 12 # The tester runs planemo on a non-tested archive, creates the test outputs
13 13 # and returns a new proper tool with test.
14 **Short Story** 14
15 15
16 This is an unusual Galaxy tool capable of generating new Galaxy tools. 16
17 It works by exposing *unrestricted* and therefore extremely dangerous scripting 17 import argparse
18 to all designated administrators of the host Galaxy server, allowing them to 18 import copy
19 run scripts in R, python, sh and perl over multiple selected input data sets, 19 import fcntl
20 writing a single new data set as output. 20 import json
21 21 import os
22 *You have a working r/python/perl/bash script or any executable with positional or argparse style parameters* 22 import re
23 23 import shlex
24 It can be turned into an ordinary Galaxy tool in minutes, using a Galaxy tool. 24 import shutil
25 25 import subprocess
26 26 import sys
27 **Automated generation of new Galaxy tools for installation into any Galaxy** 27 import tarfile
28 28 import tempfile
29 A test is generated using small sample test data inputs and parameter settings you supply. 29 import time
30 Once the test case outputs have been produced, they can be used to build a 30
31 new Galaxy tool. The supplied script or executable is baked as a requirement 31 from bioblend import galaxy
32 into a new, ordinary Galaxy tool, fully workflow compatible out of the box. 32
33 Generated tools are installed via a tool shed by an administrator 33 import galaxyxml.tool as gxt
34 and work exactly like all other Galaxy tools for your users. 34 import galaxyxml.tool.parameters as gxtp
35 35
36 **More Detail** 36 import lxml.etree as ET
37 37
38 To use the ToolFactory, you should have prepared a script to paste into a 38 import yaml
39 text box, or have a package in mind and a small test input example ready to select from your history 39
40 to test your new script. 40 myversion = "V2.3 April 2021"
41 41 verbose = True
42 ```planemo test rgToolFactory2.xml --galaxy_root ~/galaxy --test_data ~/galaxy/tools/tool_makers/toolfactory/test-data``` works for me 42 debug = True
43 43 toolFactoryURL = "https://github.com/fubar2/toolfactory"
44 There is an example in each scripting language on the Tool Factory form. You 44 FAKEEXE = "~~~REMOVE~~~ME~~~"
45 can just cut and paste these to try it out - remember to select the right 45 # need this until a PR/version bump to fix galaxyxml prepending the exe even
46 interpreter please. You'll also need to create a small test data set using 46 # with override.
47 the Galaxy history add new data tool. 47
48 48
49 If the script fails somehow, use the "redo" button on the tool output in 49 def timenow():
50 your history to recreate the form complete with broken script. Fix the bug 50 """return current time as a string"""
51 and execute again. Rinse, wash, repeat. 51 return time.strftime("%d/%m/%Y %H:%M:%S", time.localtime(time.time()))
52 52
53 Once the script runs successfully, a new Galaxy tool that runs your script 53
54 can be generated. Select the "generate" option and supply some help text and 54 cheetah_escape_table = {"$": "\\$", "#": "\\#"}
55 names. The new tool will be generated in the form of a new Galaxy datatype 55
56 *toolshed.gz* - as the name suggests, it's an archive ready to upload to a 56
57 Galaxy ToolShed as a new tool repository. 57 def cheetah_escape(text):
58 58 """Escape the Cheetah special characters $ and # in text."""
59 Once it's in a ToolShed, it can be installed into any local Galaxy server 59 return "".join([cheetah_escape_table.get(c, c) for c in text])
60 from the server administrative interface. 60
61 61
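As a quick illustration of the escaping helper on the right-hand side, the sketch below repeats `cheetah_escape` and its table so that it runs standalone. The sample string is invented for the demonstration.

```python
# Standalone copy of the helper from the new revision, for illustration only.
cheetah_escape_table = {"$": "\\$", "#": "\\#"}


def cheetah_escape(text):
    """Backslash-escape the characters Cheetah treats as special."""
    return "".join([cheetah_escape_table.get(c, c) for c in text])


# Hypothetical help text containing both special characters.
sample = "Costs $5 #notacomment"
escaped = cheetah_escape(sample)
print(escaped)
```

Text without `$` or `#` passes through unchanged, so the helper is safe to apply to every help line.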
62 Once the new tool is installed, local users can run it - each time, the script 62 def parse_citations(citations_text):
63 that was supplied when it was built will be executed with the input chosen 63 """Split citations_text on **ENTRY** markers into (kind, text) tuples, where kind is doi or bibtex."""
64 from the user's history. In other words, the tools you generate with the 64 citations = [c for c in citations_text.split("**ENTRY**") if c.strip()]
65 ToolFactory run just like any other Galaxy tool, but run your script every time. 65 citation_tuples = []
66 66 for citation in citations:
67 Tool factory tools are perfect for workflow components. One input, one output, 67 if citation.startswith("doi"):
68 no variables. 68 citation_tuples.append(("doi", citation[len("doi") :].strip()))
69 69 else:
70 *To fully and safely exploit the awesome power* of this tool, 70 citation_tuples.append(("bibtex", citation[len("bibtex") :].strip()))
71 Galaxy and the ToolShed, you should be a developer installing this 71 return citation_tuples
72 tool on a private/personal/scratch local instance where you are an 72
73 admin_user. Then, if you break it, you get to keep all the pieces see 73 class Locker:
74 https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home 74 """
75 75 multiple instances of the TF may try to update tool_conf.xml so use a simple lockfile
76 **Installation** 76 to prevent overwriting mix ups.
77 This is a Galaxy tool. You can install it most conveniently using the 77 """
78 administrative "Search and browse tool sheds" link. Find the Galaxy Main 78 def __enter__ (self):
79 toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory 79 lockfile = "/tmp/.toolfactory_lockfile.lck"
80 repository. Open it and review the code and select the option to install it. 80 if not os.path.exists(lockfile):
81 81 try:
82 If you can't get the tool that way, the xml and py files here need to be 82 os.utime(lockfile, None)
83 copied into a new tools 83 except OSError:
84 subdirectory such as tools/toolfactory Your tool_conf.xml needs a new entry 84 open(lockfile, 'a').close()
85 pointing to the xml 85 self.fp = open(lockfile)
86 file - something like:: 86 fcntl.flock(self.fp.fileno(), fcntl.LOCK_EX)
87 87
88 <section name="Tool building tools" id="toolbuilders"> 88 def __exit__ (self, _type, value, tb):
89 <tool file="toolfactory/rgToolFactory.xml"/> 89 fcntl.flock(self.fp.fileno(), fcntl.LOCK_UN)
90 </section> 90 self.fp.close()
91 91
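The `Locker` class relies on an advisory `fcntl.flock` lock. A minimal sketch of the same idea as a `contextlib` context manager follows. It assumes a Unix-like system (the `fcntl` module is not available on Windows), and the function name `exclusive_lock` and the scratch lock path are hypothetical, not part of the ToolFactory.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lockfile):
    """Hold an exclusive advisory lock on lockfile for the duration of the block."""
    # Opening in append mode creates the file if it does not exist yet.
    with open(lockfile, "a") as fp:
        fcntl.flock(fp.fileno(), fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield fp
        finally:
            fcntl.flock(fp.fileno(), fcntl.LOCK_UN)


# Hypothetical lock path in a scratch directory, not the /tmp path used above.
lock_path = os.path.join(tempfile.mkdtemp(), "demo.lck")
with exclusive_lock(lock_path):
    # Critical section: only one cooperating process runs this at a time.
    held = os.path.exists(lock_path)
print(held)
```

The lock is advisory, so it only protects against other processes that take the same lock before touching the shared file.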
92 If not already there, 92
93 please add: 93 class Tool_Conf_Updater:
94 <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" 94
95 mimetype="multipart/x-gzip" subclass="True" /> 95 """# update config/tool_conf.xml with a new tool unpacked in /tools
96 to your local data_types_conf.xml. 96 # requires highly insecure docker settings - like write to tool_conf.xml and to tools !
97 97 # if in a container possibly not so courageous.
98 98 # Fine on your own laptop but security red flag for most production instances
99 **Restricted execution** 99 Note potential race condition for tool_conf.xml update - uses a file lock.
100 100 """
101 The tool factory tool itself will then be usable ONLY by admin users - 101
102 people with IDs in admin_users in universe_wsgi.ini **Yes, that's right. ONLY 102 def __init__(
103 admin_users can run this tool** Think about it for a moment. If allowed to 103 self, args, tool_conf_path, new_tool_archive_path, new_tool_name, local_tool_dir, run_test
104 run any arbitrary script on your Galaxy server, the only thing that would 104 ):
105 impede a miscreant bent on destroying all your Galaxy data would probably 105 self.args = args
106 be lack of appropriate technical skills. 106 self.tool_conf_path = os.path.join(args.galaxy_root, tool_conf_path)
107 107 self.tool_dir = os.path.join(args.galaxy_root, local_tool_dir,'TFtools')
108 **What it does** 108 self.out_section = "ToolFactory Generated Tools"
109 109 tff = tarfile.open(new_tool_archive_path, "r:*")
110 This is a tool factory for simple scripts in python, R and 110 flist = tff.getnames()
111 perl currently. Functional tests are automatically generated. How cool is that. 111 ourdir = os.path.commonpath(flist) # eg pyrevpos
112 112 self.tool_id = ourdir # they are the same for TF tools
113 LIMITED to simple scripts that read one input from the history. Optionally can 113 ourxml = [x for x in flist if x.lower().endswith(".xml")]
114 write one new history dataset, and optionally collect any number of outputs 114 tff.extractall()
115 into links on an autogenerated HTML index page for the user to navigate - 115 tff.close()
116 useful if the script writes images and output files - pdf outputs are shown 116 self.run_rsync(ourdir, self.tool_dir)
117 as thumbnails and R's bloated PDFs are shrunk with ghostscript, so that and 117 with Locker():
118 ImageMagick need to be available. 118 self.update_toolconf(ourdir, ourxml)
119 119
120 Generated tools can be edited and enhanced like any Galaxy tool, so start 120 def run_rsync(self, srcf, dstf):
121 small and build up since a generated script gets you a serious leg up to a 121 src = os.path.abspath(srcf)
122 more complex one. 122 dst = os.path.abspath(dstf)
123 123 if os.path.isdir(src):
124 **What you do** 124 cll = ["rsync", "-r", src, dst]
125 125 else:
126 You paste and run your script, you fix the syntax errors and 126 cll = ["rsync", src, dst]
127 eventually it runs. You can use the redo button and edit the script before 127 subprocess.run(
128 trying to rerun it as you debug - it works pretty well. 128 cll,
129 129 capture_output=False,
130 Once the script works on some test data, you can generate a toolshed compatible 130 encoding="utf8",
131 gzip file containing your script ready to run as an ordinary Galaxy tool in 131 shell=False,
132 a repository on your local toolshed. That means safe and largely automated 132 )
133 installation in any production Galaxy configured to use your toolshed. 133
134 134 def update_toolconf(self, ourdir, ourxml): # path is relative to tools
135 **Generated tool Security** 135 localconf = "./local_tool_conf.xml"
136 136 self.run_rsync(self.tool_conf_path, localconf)
137 Once you install a generated tool, it's just 137 tree = ET.parse(localconf)
138 another tool - assuming the script is safe. They just run normally and their 138 root = tree.getroot()
139 user cannot do anything unusually insecure but please, practice safe toolshed. 139 hasTF = False
140 Read the code before you install any tool. Especially this one - it is really scary. 140 TFsection = None
141 141 for e in root.findall("section"):
142 **Send Code** 142 if e.attrib["name"] == self.out_section:
143 143 hasTF = True
144 Patches and suggestions welcome as bitbucket issues please? 144 TFsection = e
145 145 if not hasTF:
146 **Attribution** 146 TFsection = ET.Element("section", {"id":self.out_section, "name":self.out_section})
147 147 root.insert(0, TFsection) # at the top!
148 Creating re-usable tools from scripts: The Galaxy Tool Factory 148 our_tools = TFsection.findall("tool")
149 Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team 149 conf_tools = [x.attrib["file"] for x in our_tools]
150 Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573 150 for xml in ourxml: # may be > 1
151 151 if os.path.join('TFtools', xml) not in conf_tools: # new
152 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref 152 ET.SubElement(TFsection, "tool", {"file": os.path.join('TFtools', xml)})
153 153 newconf = f"{self.tool_id}_conf"
154 **Licensing** 154 tree.write(newconf, pretty_print=True)
155 155 self.run_rsync(newconf, self.tool_conf_path)
156 Copyright Ross Lazarus 2010 156
157 ross lazarus at g mail period com 157
158 158
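The section handling in `update_toolconf` can be tried out in isolation. The sketch below uses the standard library `xml.etree.ElementTree` as a stand-in for `lxml.etree`, and a hypothetical helper `add_tools` working on an invented minimal `tool_conf.xml`. It is an illustration of the approach, not the ToolFactory's own code. One choice made here is to compare the prefixed `TFtools/...` path against the existing entries, so that registering the same tool twice adds only one element.

```python
import os
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


def add_tools(root, section_name, xml_names, subdir="TFtools"):
    """Register each XML under the named section, creating the section if needed."""
    section = None
    for e in root.findall("section"):
        if e.attrib.get("name") == section_name:
            section = e
    if section is None:
        section = ET.Element("section", {"id": section_name, "name": section_name})
        root.insert(0, section)  # a new section goes at the top
    known = [t.attrib["file"] for t in section.findall("tool")]
    for name in xml_names:
        path = os.path.join(subdir, name)
        if path not in known:
            ET.SubElement(section, "tool", {"file": path})
            known.append(path)
    return section


# Hypothetical minimal tool_conf.xml content.
root = ET.fromstring('<toolbox><section id="other" name="Other"/></toolbox>')
add_tools(root, "ToolFactory Generated Tools", ["pyrevpos/pyrevpos.xml"])
add_tools(root, "ToolFactory Generated Tools", ["pyrevpos/pyrevpos.xml"])  # no-op
files = [t.attrib["file"] for t in root.iter("tool")]
print(files)
```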
159 All rights reserved. 159
160 160 class Tool_Factory:
161 Licensed under the LGPL 161 """Wrapper for an arbitrary script
162 162 uses galaxyxml
163 **Obligatory screenshot** 163
164 164 """
165 http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png 165
166 166 def __init__(self, args=None): # noqa
167 """
168 prepare command line cl for running the tool here
169 and prepare elements needed for galaxyxml tool generation
170 """
171 self.ourcwd = os.getcwd()
172 self.collections = []
173 if len(args.collection) > 0:
174 try:
175 self.collections = [
176 json.loads(x) for x in args.collection if len(x.strip()) > 1
177 ]
178 except Exception:
179 print(
180 f"--collections parameter {str(args.collection)} is malformed - should be a dictionary"
181 )
182 try:
183 self.infiles = [
184 json.loads(x) for x in args.input_files if len(x.strip()) > 1
185 ]
186 except Exception:
187 print(
188 f"--input_files parameter {str(args.input_files)} is malformed - should be a dictionary"
189 )
190 try:
191 self.outfiles = [
192 json.loads(x) for x in args.output_files if len(x.strip()) > 1
193 ]
194 except Exception:
195 print(
196 f"--output_files parameter {args.output_files} is malformed - should be a dictionary"
197 )
198 try:
199 self.addpar = [
200 json.loads(x) for x in args.additional_parameters if len(x.strip()) > 1
201 ]
202 except Exception:
203 print(
204 f"--additional_parameters {args.additional_parameters} is malformed - should be a dictionary"
205 )
206 try:
207 self.selpar = [
208 json.loads(x) for x in args.selecttext_parameters if len(x.strip()) > 1
209 ]
210 except Exception:
211 print(
212 f"--selecttext_parameters {args.selecttext_parameters} is malformed - should be a dictionary"
213 )
214 self.args = args
215 self.cleanuppar()
216 self.lastxclredirect = None
217 self.xmlcl = []
218 self.is_positional = self.args.parampass == "positional"
219 if self.args.sysexe:
220 if " " in self.args.sysexe:
221 self.executeme = self.args.sysexe.split(" ")
222 else:
223 self.executeme = [
224 self.args.sysexe,
225 ]
226 else:
227 if self.args.packages:
228 self.executeme = [
229 self.args.packages.split(",")[0].split(":")[0].strip(),
230 ]
231 else:
232 self.executeme = None
233 aXCL = self.xmlcl.append
234 assert args.parampass in [
235 "0",
236 "argparse",
237 "positional",
238 ], 'args.parampass must be "0","positional" or "argparse"'
239 self.tool_name = re.sub("[^a-zA-Z0-9_]+", "", args.tool_name)
240 self.tool_id = self.tool_name
241 self.newtool = gxt.Tool(
242 self.tool_name,
243 self.tool_id,
244 self.args.tool_version,
245 self.args.tool_desc,
246 FAKEEXE,
247 )
248 self.tooloutdir = "./tfout"
249 self.repdir = "./toolgen"
250 self.newtarpath = os.path.join(self.tooloutdir, "%s_not_tested.toolshed.gz" % self.tool_name)
251 self.testdir = os.path.join(self.tooloutdir, "test-data")
252 if not os.path.exists(self.tooloutdir):
253 os.mkdir(self.tooloutdir)
254 if not os.path.exists(self.testdir):
255 os.mkdir(self.testdir)
256 if not os.path.exists(self.repdir):
257 os.mkdir(self.repdir)
258 self.tinputs = gxtp.Inputs()
259 self.toutputs = gxtp.Outputs()
260 self.testparam = []
261 if self.args.script_path:
262 self.prepScript()
263 if self.args.command_override:
264 scos = open(self.args.command_override, "r").readlines()
265 self.command_override = [x.rstrip() for x in scos]
266 else:
267 self.command_override = None
268 if self.args.test_override:
269 stos = open(self.args.test_override, "r").readlines()
270 self.test_override = [x.rstrip() for x in stos]
271 else:
272 self.test_override = None
273 if self.args.script_path:
274 for ex in self.executeme:
275 aXCL(ex)
276 aXCL("$runme")
277 else:
278 for ex in self.executeme:
279 aXCL(ex)
280
281 if self.args.parampass == "0":
282 self.clsimple()
283 else:
284 if self.args.parampass == "positional":
285 self.prepclpos()
286 self.clpositional()
287 else:
288 self.prepargp()
289 self.clargparse()
290
291 def clsimple(self):
292 """no parameters or repeats - uses < and > for i/o"""
293 aXCL = self.xmlcl.append
294 if len(self.infiles) > 0:
295 aXCL("<")
296 aXCL("$%s" % self.infiles[0]["infilename"])
297 if len(self.outfiles) > 0:
298 aXCL(">")
299 aXCL("$%s" % self.outfiles[0]["name"])
300 if self.args.cl_user_suffix: # DIY CL end
301 clp = shlex.split(self.args.cl_user_suffix)
302 for c in clp:
303 aXCL(c)
304
305 def prepargp(self):
306 xclsuffix = []
307 for i, p in enumerate(self.infiles):
308 nam = p["infilename"]
309 if p["origCL"].strip().upper() == "STDIN":
310 xappendme = [
311 nam,
312 nam,
313 "< $%s" % nam,
314 ]
315 else:
316 rep = p["repeat"] == "1"
317 over = ""
318 if rep:
319 over = f'#for $rep in $R_{nam}:\n--{nam} "$rep.{nam}"\n#end for'
320 xappendme = [p["CL"], "$%s" % p["CL"], over]
321 xclsuffix.append(xappendme)
322 for i, p in enumerate(self.outfiles):
323 if p["origCL"].strip().upper() == "STDOUT":
324 self.lastxclredirect = [">", "$%s" % p["name"]]
325 else:
326 xclsuffix.append([p["name"], "$%s" % p["name"], ""])
327 for p in self.addpar:
328 nam = p["name"]
329 rep = p["repeat"] == "1"
330 if rep:
331 over = f'#for $rep in $R_{nam}:\n--{nam} "$rep.{nam}"\n#end for'
332 else:
333 over = p["override"]
334 xclsuffix.append([p["CL"], '"$%s"' % nam, over])
335 for p in self.selpar:
336 xclsuffix.append([p["CL"], '"$%s"' % p["name"], p["override"]])
337 self.xclsuffix = xclsuffix
338
339 def prepclpos(self):
340 xclsuffix = []
341 for i, p in enumerate(self.infiles):
342 if p["origCL"].strip().upper() == "STDIN":
343 xappendme = [
344 "999",
345 p["infilename"],
346 "< $%s" % p["infilename"],
347 ]
348 else:
349 xappendme = [p["CL"], "$%s" % p["infilename"], ""]
350 xclsuffix.append(xappendme)
351 for i, p in enumerate(self.outfiles):
352 if p["origCL"].strip().upper() == "STDOUT":
353 self.lastxclredirect = [">", "$%s" % p["name"]]
354 else:
355 xclsuffix.append([p["CL"], "$%s" % p["name"], ""])
356 for p in self.addpar:
357 nam = p["name"]
358 rep = p["repeat"] == "1" # repeats make NO sense
359 if rep:
360 print(
361 f"### warning. Repeats for {nam} ignored - not permitted in positional parameter command lines!"
362 )
363 over = p["override"]
364 xclsuffix.append([p["CL"], '"$%s"' % nam, over])
365 for p in self.selpar:
366 xclsuffix.append([p["CL"], '"$%s"' % p["name"], p["override"]])
367 xclsuffix.sort()
368 self.xclsuffix = xclsuffix
369
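Because the ordinals collected in `xclsuffix` are strings, a plain `sort()` orders them lexicographically. The sketch below uses invented triples to show how that differs from sorting with a numeric key. The difference only matters if a tool has ten or more positional parameters.

```python
# Hypothetical [ordinal, cheetah_variable, override] triples, ordinals as strings.
xclsuffix = [["10", "$last", ""], ["2", "$second", ""], ["1", "$first", ""]]

# Default list sorting compares the first elements as strings.
lexicographic = sorted(xclsuffix)

# Converting the ordinal to int gives the intended positional order.
numeric = sorted(xclsuffix, key=lambda item: int(item[0]))

print([item[0] for item in lexicographic])
print([item[0] for item in numeric])
```

The `"999"` placeholder used for STDIN inputs is also all digits, so it converts cleanly with `int` and still sorts last.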
370 def prepScript(self):
371 rx = open(self.args.script_path, "r").readlines()
372 rx = [x.rstrip() for x in rx]
373 rxcheck = [x.strip() for x in rx if x.strip() > ""]
374 assert len(rxcheck) > 0, "Supplied script is empty. Cannot run"
375 self.script = "\n".join(rx)
376 fhandle, self.sfile = tempfile.mkstemp(
377 prefix=self.tool_name, suffix="_%s" % (self.executeme[0])
378 )
379 tscript = open(self.sfile, "w")
380 tscript.write(self.script)
381 tscript.close()
382 self.spacedScript = [f" {x}" for x in rx if x.strip() > ""]
383 rx.insert(0, "#raw")
384 rx.append("#end raw")
385 self.escapedScript = rx
386 art = "%s.%s" % (self.tool_name, self.executeme[0])
387 artifact = open(art, "wb")
388 artifact.write(bytes(self.script, "utf8"))
389 artifact.close()
390
391 def cleanuppar(self):
392 """ positional parameters are complicated by their numeric ordinal"""
393 if self.args.parampass == "positional":
394 for i, p in enumerate(self.infiles):
395 assert (
396 p["CL"].isdigit() or p["CL"].strip().upper() == "STDIN"
397 ), "Positional parameters must be ordinal integers - got %s for %s" % (
398 p["CL"],
399 p["label"],
400 )
401 for i, p in enumerate(self.outfiles):
402 assert (
403 p["CL"].isdigit() or p["CL"].strip().upper() == "STDOUT"
404 ), "Positional parameters must be ordinal integers - got %s for %s" % (
405 p["CL"],
406 p["name"],
407 )
408 for i, p in enumerate(self.addpar):
409 assert p[
410 "CL"
411 ].isdigit(), "Positional parameters must be ordinal integers - got %s for %s" % (
412 p["CL"],
413 p["name"],
414 )
415 for i, p in enumerate(self.infiles):
416 infp = copy.copy(p)
417 infp["origCL"] = infp["CL"]
418 if self.args.parampass in ["positional", "0"]:
419 infp["infilename"] = infp["label"].replace(" ", "_")
420 else:
421 infp["infilename"] = infp["CL"]
422 self.infiles[i] = infp
423 for i, p in enumerate(self.outfiles):
424 outfp = copy.copy(p)
425 outfp["origCL"] = outfp["CL"] # keep copy
426 self.outfiles[i] = outfp
427 for i, p in enumerate(self.addpar):
428 addp = copy.copy(p)
429 addp["origCL"] = addp["CL"]
430 self.addpar[i] = addp
431
432 def clpositional(self):
433 # inputs in order then params
434 aXCL = self.xmlcl.append
435 for (k, v, koverride) in self.xclsuffix:
436 aXCL(v)
437 if self.lastxclredirect:
438 for cl in self.lastxclredirect:
439 aXCL(cl)
440 if self.args.cl_user_suffix: # DIY CL end
441 clp = shlex.split(self.args.cl_user_suffix)
442 for c in clp:
443 aXCL(c)
444
445 def clargparse(self):
446 """argparse style"""
447 aXCL = self.xmlcl.append
448 # inputs then params in argparse named form
449
450 for (k, v, koverride) in self.xclsuffix:
451 if koverride > "":
452 k = koverride
453 aXCL(k)
454 else:
455 if len(k.strip()) == 1:
456 k = "-%s" % k
457 else:
458 k = "--%s" % k
459 aXCL(k)
460 aXCL(v)
461 if self.lastxclredirect:
462 for cl in self.lastxclredirect:
463 aXCL(cl)
464 if self.args.cl_user_suffix: # DIY CL end
465 clp = shlex.split(self.args.cl_user_suffix)
466 for c in clp:
467 aXCL(c)
468
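The flag rule in `clargparse` is that one-letter names get a single dash, longer names get two, and a non-empty override replaces the flag. The sketch below is a standalone function with the hypothetical name `argparse_tokens`. Indentation is not visible in this comparison view, so it assumes that the value is appended only in the branch without an override.

```python
def argparse_tokens(xclsuffix):
    """Return command line tokens for (name, value, override) triples."""
    tokens = []
    for name, value, override in xclsuffix:
        if override > "":
            # Assumption: an override carries its own value, so nothing else is added.
            tokens.append(override)
        else:
            flag = "-%s" % name if len(name.strip()) == 1 else "--%s" % name
            tokens.append(flag)
            tokens.append(value)
    return tokens


# Hypothetical parameters: a short flag, a long flag and an overridden one.
demo = [
    ["i", "$i", ""],
    ["outfile", "$outfile", ""],
    ["n", '"$n"', '--count "$n"'],
]
tokens = argparse_tokens(demo)
print(" ".join(tokens))
```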
469 def getNdash(self, newname):
470 if self.is_positional:
471 ndash = 0
472 else:
473 ndash = 2
474 if len(newname) < 2:
475 ndash = 1
476 return ndash
477
478 def doXMLparam(self): # noqa
479 """Add all needed elements to tool"""
480 for p in self.outfiles:
481 newname = p["name"]
482 newfmt = p["format"]
483 newcl = p["CL"]
484 test = p["test"]
485 oldcl = p["origCL"]
486 test = test.strip()
487 ndash = self.getNdash(newcl)
488 aparm = gxtp.OutputData(
489 name=newname, format=newfmt, num_dashes=ndash, label=newname
490 )
491 aparm.positional = self.is_positional
492 if self.is_positional:
493 if oldcl.upper() == "STDOUT":
494 aparm.positional = 9999999
495 aparm.command_line_override = "> $%s" % newname
496 else:
497 aparm.positional = int(oldcl)
498 aparm.command_line_override = "$%s" % newname
499 self.toutputs.append(aparm)
500 ld = None
501 if test.strip() > "":
502 if test.startswith("diff"):
503 c = "diff"
504 ld = 0
505 if test.split(":")[1].isdigit():
506 ld = int(test.split(":")[1])
507 tp = gxtp.TestOutput(
508 name=newname,
509 value="%s_sample" % newname,
510 compare=c,
511 lines_diff=ld,
512 )
513 elif test.startswith("sim_size"):
514 c = "sim_size"
515 tn = test.split(":")[1].strip()
516 if tn > "":
517 if "." in tn:
518 delta = None
519 delta_frac = min(1.0, float(tn))
520 else:
521 delta = int(tn)
522 delta_frac = None
523 tp = gxtp.TestOutput(
524 name=newname,
525 value="%s_sample" % newname,
526 compare=c,
527 delta=delta,
528 delta_frac=delta_frac,
529 )
530 else:
531 c = test
532 tp = gxtp.TestOutput(
533 name=newname,
534 value="%s_sample" % newname,
535 compare=c,
536 )
537 self.testparam.append(tp)
538 for p in self.infiles:
539 newname = p["infilename"]
540 newfmt = p["format"]
541 ndash = self.getNdash(newname)
542 reps = p.get("repeat", "0") == "1"
543 if not len(p["label"]) > 0:
544 alab = p["CL"]
545 else:
546 alab = p["label"]
547 aninput = gxtp.DataParam(
548 newname,
549 optional=False,
550 label=alab,
551 help=p["help"],
552 format=newfmt,
553 multiple=False,
554 num_dashes=ndash,
555 )
556 aninput.positional = self.is_positional
557 if self.is_positional:
558 if p["origCL"].upper() == "STDIN":
559 aninput.positional = 9999998
560 aninput.command_line_override = "< $%s" % newname
561 else:
562 aninput.positional = int(p["origCL"])
563 aninput.command_line_override = "$%s" % newname
564 if reps:
565 repe = gxtp.Repeat(
566 name=f"R_{newname}", title=f"Add as many {alab} as needed"
567 )
568 repe.append(aninput)
569 self.tinputs.append(repe)
570 tparm = gxtp.TestRepeat(name=f"R_{newname}")
571 tparm2 = gxtp.TestParam(newname, value="%s_sample" % newname)
572 tparm.append(tparm2)
573 self.testparam.append(tparm)
574 else:
575 self.tinputs.append(aninput)
576 tparm = gxtp.TestParam(newname, value="%s_sample" % newname)
577 self.testparam.append(tparm)
578 for p in self.addpar:
579 newname = p["name"]
580 newval = p["value"]
581 newlabel = p["label"]
582 newhelp = p["help"]
583 newtype = p["type"]
584 newcl = p["CL"]
585 oldcl = p["origCL"]
586 reps = p["repeat"] == "1"
587 if not len(newlabel) > 0:
588 newlabel = newname
589 ndash = self.getNdash(newname)
590 if newtype == "text":
591 aparm = gxtp.TextParam(
592 newname,
593 label=newlabel,
594 help=newhelp,
595 value=newval,
596 num_dashes=ndash,
597 )
598 elif newtype == "integer":
599 aparm = gxtp.IntegerParam(
600 newname,
601 label=newlabel,
602 help=newhelp,
603 value=newval,
604 num_dashes=ndash,
605 )
606 elif newtype == "float":
607 aparm = gxtp.FloatParam(
608 newname,
609 label=newlabel,
610 help=newhelp,
611 value=newval,
612 num_dashes=ndash,
613 )
614 elif newtype == "boolean":
615 aparm = gxtp.BooleanParam(
616 newname,
617 label=newlabel,
618 help=newhelp,
619 value=newval,
620 num_dashes=ndash,
621 )
622 else:
623 raise ValueError(
624 'Unrecognised parameter type "%s" for\
625 additional parameter %s in makeXML'
626 % (newtype, newname)
627 )
628 aparm.positional = self.is_positional
629 if self.is_positional:
630 aparm.positional = int(oldcl)
631 if reps:
632 repe = gxtp.Repeat(
633 name=f"R_{newname}", title=f"Add as many {newlabel} as needed"
634 )
635 repe.append(aparm)
636 self.tinputs.append(repe)
637 tparm = gxtp.TestRepeat(name=f"R_{newname}")
638 tparm2 = gxtp.TestParam(newname, value=newval)
639 tparm.append(tparm2)
640 self.testparam.append(tparm)
641 else:
642 self.tinputs.append(aparm)
643 tparm = gxtp.TestParam(newname, value=newval)
644 self.testparam.append(tparm)
645 for p in self.selpar:
646 newname = p["name"]
647 newval = p["value"]
648 newlabel = p["label"]
649 newhelp = p["help"]
650 newtype = p["type"]
651 newcl = p["CL"]
652 if not len(newlabel) > 0:
653 newlabel = newname
654 ndash = self.getNdash(newname)
655 if newtype == "selecttext":
656 newtext = p["texts"]
657 aparm = gxtp.SelectParam(
658 newname,
659 label=newlabel,
660 help=newhelp,
661 num_dashes=ndash,
662 )
663 for i in range(len(newval)):
664 anopt = gxtp.SelectOption(
665 value=newval[i],
666 text=newtext[i],
667 )
668 aparm.append(anopt)
669 aparm.positional = self.is_positional
670 if self.is_positional:
671 aparm.positional = int(newcl)
672 self.tinputs.append(aparm)
673 tparm = gxtp.TestParam(newname, value=newval)
674 self.testparam.append(tparm)
675 else:
676 raise ValueError(
677 'Unrecognised parameter type "%s" for\
678 selecttext parameter %s in makeXML'
679 % (newtype, newname)
680 )
681 for p in self.collections:
682 newkind = p["kind"]
683 newname = p["name"]
684 newlabel = p["label"]
685 newdisc = p["discover"]
686 collect = gxtp.OutputCollection(newname, label=newlabel, type=newkind)
687 disc = gxtp.DiscoverDatasets(
688 pattern=newdisc, directory=f"{newname}", visible="false"
689 )
690 collect.append(disc)
691 self.toutputs.append(collect)
692 try:
693 tparm = gxtp.TestOutputCollection(newname) # broken until PR merged.
694 self.testparam.append(tparm)
695 except Exception:
696 print(
697 "#### WARNING: Galaxyxml version does not have the PR merged yet - tests for collections must be over-ridden until then!"
698 )
699
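The output test strings handled in `doXMLparam` take forms such as `diff:N` and `sim_size:N`. A minimal sketch of that parsing as a pure function may make the cases easier to follow. The name `parse_test_spec` and the returned dictionary layout are hypothetical. The sketch also matches the part before the colon exactly, where the code above uses `startswith`, and it tolerates a missing number.

```python
def parse_test_spec(test):
    """Return keyword arguments describing how an output should be compared."""
    test = test.strip()
    kind, _, arg = test.partition(":")
    arg = arg.strip()
    if kind == "diff":
        # A missing or non-numeric count falls back to zero differing lines.
        return {"compare": "diff", "lines_diff": int(arg) if arg.isdigit() else 0}
    if kind == "sim_size":
        if arg == "":
            return {"compare": "sim_size"}
        if "." in arg:
            # A fractional value is a proportion, capped at 1.0.
            return {"compare": "sim_size", "delta_frac": min(1.0, float(arg))}
        return {"compare": "sim_size", "delta": int(arg)}
    # Anything else is passed through as the comparison name.
    return {"compare": test}


for spec in ["diff:3", "sim_size:0.2", "sim_size:500", "contains"]:
    print(spec, parse_test_spec(spec))
```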
700 def doNoXMLparam(self):
701 """filter style package - stdin to stdout"""
702 if len(self.infiles) > 0:
703 alab = self.infiles[0]["label"]
704 if len(alab) == 0:
705 alab = self.infiles[0]["infilename"]
706 max1s = (
707 "Maximum one input if parampass is 0 but multiple input files supplied - %s"
708 % str(self.infiles)
709 )
710 assert len(self.infiles) == 1, max1s
711 newname = self.infiles[0]["infilename"]
712 aninput = gxtp.DataParam(
713 newname,
714 optional=False,
715 label=alab,
716 help=self.infiles[0]["help"],
717 format=self.infiles[0]["format"],
718 multiple=False,
719 num_dashes=0,
720 )
721 aninput.command_line_override = "< $%s" % newname
722 aninput.positional = True
723 self.tinputs.append(aninput)
724 tp = gxtp.TestParam(name=newname, value="%s_sample" % newname)
725 self.testparam.append(tp)
726 if len(self.outfiles) > 0:
727 newname = self.outfiles[0]["name"]
728 newfmt = self.outfiles[0]["format"]
729 anout = gxtp.OutputData(newname, format=newfmt, num_dashes=0)
730 anout.command_line_override = "> $%s" % newname
731 anout.positional = self.is_positional
732 self.toutputs.append(anout)
733 tp = gxtp.TestOutput(name=newname, value="%s_sample" % newname)
734 self.testparam.append(tp)
735
736 def makeXML(self): # noqa
737 """
738 Create a Galaxy xml tool wrapper for the new script
739 Uses galaxyxml
740 Hmmm. How to get the command line into correct order...
741 """
742 if self.command_override:
743 self.newtool.command_override = self.command_override # config file
744 else:
745 self.newtool.command_override = self.xmlcl
746 cite = gxtp.Citations()
747 acite = gxtp.Citation(type="doi", value="10.1093/bioinformatics/bts573")
748 cite.append(acite)
749 self.newtool.citations = cite
750 safertext = ""
751 if self.args.help_text:
752 helptext = open(self.args.help_text, "r").readlines()
753 safertext = "\n".join([cheetah_escape(x) for x in helptext])
754 if len(safertext.strip()) == 0:
755 safertext = (
756 "Ask the tool author (%s) to rebuild with help text please\n"
757 % (self.args.user_email)
758 )
759 if self.args.script_path:
760 if len(safertext) > 0:
761 safertext = safertext + "\n\n------\n" # transition allowed!
762 scr = [x for x in self.spacedScript if x.strip() > ""]
763 scr.insert(0, "\n\nScript::\n")
764 if len(scr) > 300:
765 scr = (
766 scr[:100]
767 + [" >300 lines - stuff deleted", " ......"]
768 + scr[-100:]
769 )
770 scr.append("\n")
771 safertext = safertext + "\n".join(scr)
772 self.newtool.help = safertext
773 self.newtool.version_command = f'echo "{self.args.tool_version}"'
774 std = gxtp.Stdios()
775 std1 = gxtp.Stdio()
776 std.append(std1)
777 self.newtool.stdios = std
778 requirements = gxtp.Requirements()
779 if self.args.packages:
780 try:
781 for d in self.args.packages.split(","):
782 ver = None
783 packg = None
784 d = d.replace("==", ":")
785 d = d.replace("=", ":")
786 if ":" in d:
787 packg, ver = d.split(":")
788 ver = ver.strip()
789 packg = packg.strip()
790 else:
791 packg = d.strip()
792 ver = None
793 if ver == "":
794 ver = None
795 if packg:
796 requirements.append(
797 gxtp.Requirement("package", packg.strip(), ver)
798 )
799 except Exception:
800 print(
801 "### malformed packages string supplied - cannot parse =",
802 self.args.packages,
803 )
804 sys.exit(2)
805 self.newtool.requirements = requirements
806 if self.args.parampass == "0":
807 self.doNoXMLparam()
808 else:
809 self.doXMLparam()
810 self.newtool.outputs = self.toutputs
811 self.newtool.inputs = self.tinputs
812 if self.args.script_path:
813 configfiles = gxtp.Configfiles()
814 configfiles.append(
815 gxtp.Configfile(name="runme", text="\n".join(self.escapedScript))
816 )
817 self.newtool.configfiles = configfiles
818 tests = gxtp.Tests()
819 test_a = gxtp.Test()
820 for tp in self.testparam:
821 test_a.append(tp)
822 tests.append(test_a)
823 self.newtool.tests = tests
824 self.newtool.add_comment(
825 "Created by %s at %s using the Galaxy Tool Factory."
826 % (self.args.user_email, timenow())
827 )
828 self.newtool.add_comment("Source in git at: %s" % (toolFactoryURL))
829 exml0 = self.newtool.export()
830 exml = exml0.replace(FAKEEXE, "") # temporary work around until PR accepted
831 if (
832 self.test_override
833 ): # cannot do this inside galaxyxml as it expects lxml objects for tests
834 part1 = exml.split("<tests>")[0]
835 part2 = exml.split("</tests>")[1]
836 fixed = "%s\n%s\n%s" % (part1, "\n".join(self.test_override), part2)
837 exml = fixed
838 # exml = exml.replace('range="1:"', 'range="1000:"')
839 with open("%s.xml" % self.tool_name, "w") as xf:
840 xf.write(exml)
841 xf.write("\n")
842 with open(self.args.untested_tool_out, 'w') as outf:
843 outf.write(exml)
844 outf.write('\n')
845 # ready for the tarball
846
847 def writeShedyml(self):
848 """write the .shed.yml metadata file that planemo expects in the tool directory"""
849 yuser = self.args.user_email.split("@")[0]
850 yfname = os.path.join(self.tooloutdir, ".shed.yml")
851 yamlf = open(yfname, "w")
852 odict = {
853 "name": self.tool_name,
854 "owner": yuser,
855 "type": "unrestricted",
856 "description": self.args.tool_desc,
857 "synopsis": self.args.tool_desc,
858 "category": "TF Generated Tools",
859 }
860 yaml.dump(odict, yamlf, allow_unicode=True)
861 yamlf.close()
862
863 def makeTool(self):
864 """write xmls and input samples into place"""
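865 # parampass arrives from argparse as a string, so the former
866 # `== 0` test never matched; makeXML() dispatches on parampass
867 # itself and must always run because it writes the tool XML.
868 self.makeXML()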
869 if self.args.script_path:
870 stname = os.path.join(self.tooloutdir, self.sfile)
871 if not os.path.exists(stname):
872 shutil.copyfile(self.sfile, stname)
873 xreal = "%s.xml" % self.tool_name
874 xout = os.path.join(self.tooloutdir, xreal)
875 shutil.copyfile(xreal, xout)
876 #xout = os.path.join(self.repdir, xreal)
877 #shutil.copyfile(xreal, xout)
878 for p in self.infiles:
879 pth = p["name"]
880 dest = os.path.join(self.testdir, "%s_sample" % p["infilename"])
881 shutil.copyfile(pth, dest)
882 dest = os.path.join(
883 self.repdir, "%s_sample.%s" % (p["infilename"], p["format"])
884 )
885 shutil.copyfile(pth, dest)
886
887 def makeToolTar(self, report_fail=False):
888 """move outputs into test-data and prepare the tarball"""
889 excludeme = "_planemo_test_report.html"
890
891 def exclude_function(tarinfo):
892 filename = tarinfo.name
893 return None if filename.endswith(excludeme) else tarinfo
894
895 for p in self.outfiles:
896 oname = p["name"]
897 tdest = os.path.join(self.testdir, "%s_sample" % oname)
898 src = os.path.join(self.testdir, oname)
899 if not os.path.isfile(tdest):
900 if os.path.isfile(src):
901 shutil.copyfile(src, tdest)
902 dest = os.path.join(self.repdir, "%s.sample.%s" % (oname,p['format']))
903 shutil.copyfile(src, dest)
904 else:
905 if report_fail:
906 print(
907 "###Tool may have failed - output file %s not found in testdir after planemo run %s."
908 % (tdest, self.testdir)
909 )
910 tf = tarfile.open(self.newtarpath, "w:gz")
911 tf.add(
912 name=self.tooloutdir,
913 arcname=self.tool_name,
914 filter=exclude_function,
915 )
916 tf.close()  # finish writing the archive before it is copied
917 shutil.copy(self.newtarpath, os.path.join(self.tooloutdir, f"{self.tool_name}_untested.toolshed.gz"))
918
919
920 def main():
921 """
922 This is a Galaxy wrapper.
923 It expects to be called by a special purpose tool.xml
924
925 """
926 parser = argparse.ArgumentParser()
927 a = parser.add_argument
928 a("--script_path", default=None)
929 a("--history_test", default=None)
930 a("--cl_user_suffix", default=None)
931 a("--sysexe", default=None)
932 a("--packages", default=None)
933 a("--tool_name", default="newtool")
934 a("--tool_dir", default=None)
935 a("--input_files", default=[], action="append")
936 a("--output_files", default=[], action="append")
937 a("--user_email", default="Unknown")
938 a("--bad_user", default=None)
939 a("--help_text", default=None)
940 a("--tool_desc", default=None)
941 a("--tool_version", default=None)
942 a("--citations", default=None)
943 a("--command_override", default=None)
944 a("--test_override", default=None)
945 a("--additional_parameters", action="append", default=[])
946 a("--selecttext_parameters", action="append", default=[])
947 a("--edit_additional_parameters", action="store_true", default=False)
948 a("--parampass", default="positional")
949 a("--tfout", default="./tfout")
950 a("--galaxy_root", default="/galaxy-central")
951 a("--galaxy_venv", default="/galaxy_venv")
952 a("--collection", action="append", default=[])
953 a("--include_tests", default=False, action="store_true")
954 a("--install", default="1")
955 a("--admin_only", default=False, action="store_true")
956 a("--untested_tool_out", default=None)
957 a("--local_tools", default="tools") # relative to $__root_dir__
958 a("--tool_conf_path", default="config/tool_conf.xml") # relative to $__root_dir__
959 args = parser.parse_args()
960 if args.admin_only:
961 assert not args.bad_user, (
962 'UNAUTHORISED: %s is NOT authorized to use this tool until Galaxy '
963 'admin adds %s to "admin_users" in the galaxy.yml Galaxy configuration file'
964 % (args.bad_user, args.bad_user)
965 )
966 assert args.tool_name, "## Tool Factory expects a tool name - eg --tool_name=DESeq"
967 r = Tool_Factory(args)
968 r.writeShedyml()
969 r.makeTool()
970 r.makeToolTar()
971 if args.install == "1":
972 TCU = Tool_Conf_Updater(
973 args=args,
974 local_tool_dir=args.local_tools,
975 new_tool_archive_path=r.newtarpath,
976 tool_conf_path=args.tool_conf_path,
977 new_tool_name=r.tool_name,
978 run_test=getattr(args, "run_test", False),  # no --run_test option is defined above, so default to False
979 )
980
981 if __name__ == "__main__":
982 main()
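
The `--packages` handling above accepts a comma-separated string in which each entry may be `name`, `name=version`, `name==version` or `name:version`. A minimal standalone sketch of that parsing is given below for illustration. The function name `parse_packages` and its return type are hypothetical and are not part of the Tool Factory source. It also differs from the original in one respect: it uses `str.partition`, so an entry with more than one separator keeps the remainder as the version rather than raising and exiting with status 2.

```python
from typing import List, Optional, Tuple


def parse_packages(spec: str) -> List[Tuple[str, Optional[str]]]:
    """Split 'a==1,b=2,c:3,d' into (name, version) pairs.

    Hypothetical helper for illustration only; not in the original source.
    """
    parsed: List[Tuple[str, Optional[str]]] = []
    for item in spec.split(","):
        # Normalise pip-style "==" and conda-style "=" to a single ":".
        item = item.replace("==", ":").replace("=", ":")
        # partition() splits on the first ":" only and never raises.
        name, _, version = item.partition(":")
        name = name.strip()
        # An empty version string is treated as "no version pinned".
        cleaned_version = version.strip() or None
        if name:
            parsed.append((name, cleaned_version))
    return parsed
```

Each resulting pair could then be passed to `gxtp.Requirement("package", name, version)` in the same way as the loop in the source does.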