comparison toolfactory/test-data/input1_sample @ 50:807b8d103f18 draft default tip
Uploaded
author | fubar |
---|---|
date | Wed, 09 Jun 2021 02:28:01 +0000 |
parents | 11a89a9f2f92 |
children |
comparison
49:36d46b20095a | 50:807b8d103f18 |
---|---|
1 *WARNING before you start* | 1 # see https://github.com/fubar2/toolfactory |
2 | 2 # |
3 Install this tool on a private Galaxy ONLY | 3 # copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012 |
4 Please NEVER on a public or production instance | 4 # |
5 | 5 # all rights reserved |
6 Updated August 2014 by John Chilton adding citation support | 6 # Licensed under the LGPL |
7 | 7 # suggestions for improvement and bug fixes welcome at |
8 Updated August 8 2014 to fix bugs reported by Marius van den Beek | 8 # https://github.com/fubar2/toolfactory |
9 | 9 # |
10 Please cite the resource at | 10 # April 2021: Refactored into two tools - generate and test/install |
11 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref | 11 # as part of GTN tutorial development and biocontainer adoption |
12 if you use this tool in your published work. | 12 # The tester runs planemo on a non-tested archive, creates the test outputs |
13 | 13 # and returns a new proper tool with test. |
14 **Short Story** | 14 |
15 | 15 |
16 This is an unusual Galaxy tool capable of generating new Galaxy tools. | 16 |
17 It works by exposing *unrestricted* and therefore extremely dangerous scripting | 17 import argparse |
18 to all designated administrators of the host Galaxy server, allowing them to | 18 import copy |
19 run scripts in R, python, sh and perl over multiple selected input data sets, | 19 import fcntl |
20 writing a single new data set as output. | 20 import json |
21 | 21 import os |
22 *You have a working r/python/perl/bash script or any executable with positional or argparse style parameters* | 22 import re |
23 | 23 import shlex |
24 It can be turned into an ordinary Galaxy tool in minutes, using a Galaxy tool. | 24 import shutil |
25 | 25 import subprocess |
26 | 26 import sys |
27 **Automated generation of new Galaxy tools for installation into any Galaxy** | 27 import tarfile |
28 | 28 import tempfile |
29 A test is generated using small sample test data inputs and parameter settings you supply. | 29 import time |
30 Once the test case outputs have been produced, they can be used to build a | 30 |
31 new Galaxy tool. The supplied script or executable is baked as a requirement | 31 from bioblend import galaxy |
32 into a new, ordinary Galaxy tool, fully workflow compatible out of the box. | 32 |
33 Generated tools are installed via a tool shed by an administrator | 33 import galaxyxml.tool as gxt |
34 and work exactly like all other Galaxy tools for your users. | 34 import galaxyxml.tool.parameters as gxtp |
35 | 35 |
36 **More Detail** | 36 import lxml.etree as ET |
37 | 37 |
38 To use the ToolFactory, you should have prepared a script to paste into a | 38 import yaml |
39 text box, or have a package in mind and a small test input example ready to select from your history | 39 |
40 to test your new script. | 40 myversion = "V2.3 April 2021" |
41 | 41 verbose = True |
42 ```planemo test rgToolFactory2.xml --galaxy_root ~/galaxy --test_data ~/galaxy/tools/tool_makers/toolfactory/test-data``` works for me | 42 debug = True |
43 | 43 toolFactoryURL = "https://github.com/fubar2/toolfactory" |
44 There is an example in each scripting language on the Tool Factory form. You | 44 FAKEEXE = "~~~REMOVE~~~ME~~~" |
45 can just cut and paste these to try it out - remember to select the right | 45 # need this until a PR/version bump to fix galaxyxml prepending the exe even |
46 interpreter please. You'll also need to create a small test data set using | 46 # with override. |
47 the Galaxy history add new data tool. | 47 |
48 | 48 |
49 If the script fails somehow, use the "redo" button on the tool output in | 49 def timenow(): |
50 your history to recreate the form complete with broken script. Fix the bug | 50 """return current time as a string""" |
51 and execute again. Rinse, wash, repeat. | 51 return time.strftime("%d/%m/%Y %H:%M:%S", time.localtime(time.time())) |
52 | 52 |
53 Once the script runs successfully, a new Galaxy tool that runs your script | 53 |
54 can be generated. Select the "generate" option and supply some help text and | 54 cheetah_escape_table = {"$": "\\$", "#": "\\#"} |
55 names. The new tool will be generated in the form of a new Galaxy datatype | 55 |
56 *toolshed.gz* - as the name suggests, it's an archive ready to upload to a | 56 |
57 Galaxy ToolShed as a new tool repository. | 57 def cheetah_escape(text): |
58 | 58 """Backslash-escape the Cheetah metacharacters $ and # within text.""" |
59 Once it's in a ToolShed, it can be installed into any local Galaxy server | 59 return "".join([cheetah_escape_table.get(c, c) for c in text]) |
60 from the server administrative interface. | 60 |
61 | 61 |
62 Once the new tool is installed, local users can run it - each time, the script | 62 def parse_citations(citations_text): |
63 that was supplied when it was built will be executed with the input chosen | 63 """Split **ENTRY**-separated text into (type, value) tuples, where type is doi or bibtex.""" |
64 from the user's history. In other words, the tools you generate with the | 64 citations = [c for c in citations_text.split("**ENTRY**") if c.strip()] |
65 ToolFactory run just like any other Galaxy tool, but run your script every time. | 65 citation_tuples = [] |
66 | 66 for citation in citations: |
67 Tool factory tools are perfect for workflow components. One input, one output, | 67 if citation.startswith("doi"): |
68 no variables. | 68 citation_tuples.append(("doi", citation[len("doi") :].strip())) |
69 | 69 else: |
70 *To fully and safely exploit the awesome power* of this tool, | 70 citation_tuples.append(("bibtex", citation[len("bibtex") :].strip())) |
71 Galaxy and the ToolShed, you should be a developer installing this | 71 return citation_tuples |
72 tool on a private/personal/scratch local instance where you are an | 72 |
73 admin_user. Then, if you break it, you get to keep all the pieces. See | 73 class Locker: |
74 https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home | 74 """ |
75 | 75 multiple instances of the TF may try to update tool_conf.xml so use a simple lockfile |
76 **Installation** | 76 to prevent overwriting mix ups. |
77 This is a Galaxy tool. You can install it most conveniently using the | 77 """ |
78 administrative "Search and browse tool sheds" link. Find the Galaxy Main | 78 def __enter__ (self): |
79 toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory | 79 lockfile = "/tmp/.toolfactory_lockfile.lck" |
80 repository. Open it and review the code and select the option to install it. | 80 if not os.path.exists(lockfile): |
81 | 81 try: |
82 If you can't get the tool that way, the xml and py files here need to be | 82 os.utime(lockfile, None) |
83 copied into a new tools | 83 except OSError: |
84 subdirectory such as tools/toolfactory Your tool_conf.xml needs a new entry | 84 open(lockfile, 'a').close() |
85 pointing to the xml | 85 self.fp = open(lockfile) |
86 file - something like:: | 86 fcntl.flock(self.fp.fileno(), fcntl.LOCK_EX) |
87 | 87 |
88 <section name="Tool building tools" id="toolbuilders"> | 88 def __exit__ (self, _type, value, tb): |
89 <tool file="toolfactory/rgToolFactory.xml"/> | 89 fcntl.flock(self.fp.fileno(), fcntl.LOCK_UN) |
90 </section> | 90 self.fp.close() |
91 | 91 |
92 If not already there, | 92 |
93 please add: | 93 class Tool_Conf_Updater: |
94 <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" | 94 |
95 mimetype="multipart/x-gzip" subclass="True" /> | 95 """# update config/tool_conf.xml with a new tool unpacked in /tools |
96 to your local data_types_conf.xml. | 96 # requires highly insecure docker settings - like write to tool_conf.xml and to tools ! |
97 | 97 # if in a container possibly not so courageous. |
98 | 98 # Fine on your own laptop but security red flag for most production instances |
99 **Restricted execution** | 99 Note potential race condition for tool_conf.xml update - uses a file lock. |
100 | 100 """ |
101 The tool factory tool itself will then be usable ONLY by admin users - | 101 |
102 people with IDs in admin_users in universe_wsgi.ini. **Yes, that's right. ONLY | 102 def __init__( |
103 admin_users can run this tool** Think about it for a moment. If allowed to | 103 self, args, tool_conf_path, new_tool_archive_path, new_tool_name, local_tool_dir, run_test |
104 run any arbitrary script on your Galaxy server, the only thing that would | 104 ): |
105 impede a miscreant bent on destroying all your Galaxy data would probably | 105 self.args = args |
106 be lack of appropriate technical skills. | 106 self.tool_conf_path = os.path.join(args.galaxy_root, tool_conf_path) |
107 | 107 self.tool_dir = os.path.join(args.galaxy_root, local_tool_dir,'TFtools') |
108 **What it does** | 108 self.out_section = "ToolFactory Generated Tools" |
109 | 109 tff = tarfile.open(new_tool_archive_path, "r:*") |
110 This is a tool factory for simple scripts in python, R and | 110 flist = tff.getnames() |
111 perl currently. Functional tests are automatically generated. How cool is that. | 111 ourdir = os.path.commonpath(flist) # eg pyrevpos |
112 | 112 self.tool_id = ourdir # they are the same for TF tools |
113 LIMITED to simple scripts that read one input from the history. Optionally can | 113 ourxml = [x for x in flist if x.lower().endswith(".xml")] |
114 write one new history dataset, and optionally collect any number of outputs | 114 tff.extractall() |
115 into links on an autogenerated HTML index page for the user to navigate - | 115 tff.close() |
116 useful if the script writes images and output files - pdf outputs are shown | 116 self.run_rsync(ourdir, self.tool_dir) |
117 as thumbnails and R's bloated PDFs are shrunk with ghostscript, so both it and | 117 with Locker(): |
118 ImageMagick need to be available. | 118 self.update_toolconf(ourdir, ourxml) |
119 | 119 |
120 Generated tools can be edited and enhanced like any Galaxy tool, so start | 120 def run_rsync(self, srcf, dstf): |
121 small and build up since a generated script gets you a serious leg up to a | 121 src = os.path.abspath(srcf) |
122 more complex one. | 122 dst = os.path.abspath(dstf) |
123 | 123 if os.path.isdir(src): |
124 **What you do** | 124 cll = ["rsync", "-r", src, dst] |
125 | 125 else: |
126 You paste and run your script, you fix the syntax errors and | 126 cll = ["rsync", src, dst] |
127 eventually it runs. You can use the redo button and edit the script before | 127 subprocess.run( |
128 trying to rerun it as you debug - it works pretty well. | 128 cll, |
129 | 129 capture_output=False, |
130 Once the script works on some test data, you can generate a toolshed compatible | 130 encoding="utf8", |
131 gzip file containing your script ready to run as an ordinary Galaxy tool in | 131 shell=False, |
132 a repository on your local toolshed. That means safe and largely automated | 132 ) |
133 installation in any production Galaxy configured to use your toolshed. | 133 |
134 | 134 def update_toolconf(self, ourdir, ourxml): # path is relative to tools |
135 **Generated tool Security** | 135 localconf = "./local_tool_conf.xml" |
136 | 136 self.run_rsync(self.tool_conf_path, localconf) |
137 Once you install a generated tool, it's just | 137 tree = ET.parse(localconf) |
138 another tool - assuming the script is safe. They just run normally and their | 138 root = tree.getroot() |
139 user cannot do anything unusually insecure but please, practice safe toolshed. | 139 hasTF = False |
140 Read the code before you install any tool. Especially this one - it is really scary. | 140 TFsection = None |
141 | 141 for e in root.findall("section"): |
142 **Send Code** | 142 if e.attrib["name"] == self.out_section: |
143 | 143 hasTF = True |
144 Patches and suggestions welcome as bitbucket issues please? | 144 TFsection = e |
145 | 145 if not hasTF: |
146 **Attribution** | 146 TFsection = ET.Element("section", {"id":self.out_section, "name":self.out_section}) |
147 | 147 root.insert(0, TFsection) # at the top! |
148 Creating re-usable tools from scripts: The Galaxy Tool Factory | 148 our_tools = TFsection.findall("tool") |
149 Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team | 149 conf_tools = [x.attrib["file"] for x in our_tools] |
150 Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573 | 150 for xml in ourxml: # may be > 1 |
151 | 151 if xml not in conf_tools: # new |
152 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref | 152 ET.SubElement(TFsection, "tool", {"file": os.path.join('TFtools', xml)}) |
153 | 153 newconf = f"{self.tool_id}_conf" |
154 **Licensing** | 154 tree.write(newconf, pretty_print=True) |
155 | 155 self.run_rsync(newconf, self.tool_conf_path) |
156 Copyright Ross Lazarus 2010 | 156 |
157 ross lazarus at g mail period com | 157 |
158 | 158 |
159 All rights reserved. | 159 |
160 | 160 class Tool_Factory: |
161 Licensed under the LGPL | 161 """Wrapper for an arbitrary script |
162 | 162 uses galaxyxml |
163 **Obligatory screenshot** | 163 |
164 | 164 """ |
165 http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png | 165 |
166 | 166 def __init__(self, args=None): # noqa |
167 """ | |
168 prepare command line cl for running the tool here | |
169 and prepare elements needed for galaxyxml tool generation | |
170 """ | |
171 self.ourcwd = os.getcwd() | |
172 self.collections = [] | |
173 if len(args.collection) > 0: | |
174 try: | |
175 self.collections = [ | |
176 json.loads(x) for x in args.collection if len(x.strip()) > 1 | |
177 ] | |
178 except Exception: | |
179 print( | |
180 f"--collections parameter {str(args.collection)} is malformed - should be a dictionary" | |
181 ) | |
182 try: | |
183 self.infiles = [ | |
184 json.loads(x) for x in args.input_files if len(x.strip()) > 1 | |
185 ] | |
186 except Exception: | |
187 print( | |
188 f"--input_files parameter {str(args.input_files)} is malformed - should be a dictionary" | |
189 ) | |
190 try: | |
191 self.outfiles = [ | |
192 json.loads(x) for x in args.output_files if len(x.strip()) > 1 | |
193 ] | |
194 except Exception: | |
195 print( | |
196 f"--output_files parameter {args.output_files} is malformed - should be a dictionary" | |
197 ) | |
198 try: | |
199 self.addpar = [ | |
200 json.loads(x) for x in args.additional_parameters if len(x.strip()) > 1 | |
201 ] | |
202 except Exception: | |
203 print( | |
204 f"--additional_parameters {args.additional_parameters} is malformed - should be a dictionary" | |
205 ) | |
206 try: | |
207 self.selpar = [ | |
208 json.loads(x) for x in args.selecttext_parameters if len(x.strip()) > 1 | |
209 ] | |
210 except Exception: | |
211 print( | |
212 f"--selecttext_parameters {args.selecttext_parameters} is malformed - should be a dictionary" | |
213 ) | |
214 self.args = args | |
215 self.cleanuppar() | |
216 self.lastxclredirect = None | |
217 self.xmlcl = [] | |
218 self.is_positional = self.args.parampass == "positional" | |
219 if self.args.sysexe: | |
220 if " " in self.args.sysexe: | |
221 self.executeme = self.args.sysexe.split(" ") | |
222 else: | |
223 self.executeme = [ | |
224 self.args.sysexe, | |
225 ] | |
226 else: | |
227 if self.args.packages: | |
228 self.executeme = [ | |
229 self.args.packages.split(",")[0].split(":")[0].strip(), | |
230 ] | |
231 else: | |
232 self.executeme = None | |
233 aXCL = self.xmlcl.append | |
234 assert args.parampass in [ | |
235 "0", | |
236 "argparse", | |
237 "positional", | |
238 ], 'args.parampass must be "0","positional" or "argparse"' | |
239 self.tool_name = re.sub("[^a-zA-Z0-9_]+", "", args.tool_name) | |
240 self.tool_id = self.tool_name | |
241 self.newtool = gxt.Tool( | |
242 self.tool_name, | |
243 self.tool_id, | |
244 self.args.tool_version, | |
245 self.args.tool_desc, | |
246 FAKEEXE, | |
247 ) | |
248 self.tooloutdir = "./tfout" | |
249 self.repdir = "./toolgen" | |
250 self.newtarpath = os.path.join(self.tooloutdir, "%s_not_tested.toolshed.gz" % self.tool_name) | |
251 self.testdir = os.path.join(self.tooloutdir, "test-data") | |
252 if not os.path.exists(self.tooloutdir): | |
253 os.mkdir(self.tooloutdir) | |
254 if not os.path.exists(self.testdir): | |
255 os.mkdir(self.testdir) | |
256 if not os.path.exists(self.repdir): | |
257 os.mkdir(self.repdir) | |
258 self.tinputs = gxtp.Inputs() | |
259 self.toutputs = gxtp.Outputs() | |
260 self.testparam = [] | |
261 if self.args.script_path: | |
262 self.prepScript() | |
263 if self.args.command_override: | |
264 scos = open(self.args.command_override, "r").readlines() | |
265 self.command_override = [x.rstrip() for x in scos] | |
266 else: | |
267 self.command_override = None | |
268 if self.args.test_override: | |
269 stos = open(self.args.test_override, "r").readlines() | |
270 self.test_override = [x.rstrip() for x in stos] | |
271 else: | |
272 self.test_override = None | |
273 if self.args.script_path: | |
274 for ex in self.executeme: | |
275 aXCL(ex) | |
276 aXCL("$runme") | |
277 else: | |
278 for ex in self.executeme: | |
279 aXCL(ex) | |
280 | |
281 if self.args.parampass == "0": | |
282 self.clsimple() | |
283 else: | |
284 if self.args.parampass == "positional": | |
285 self.prepclpos() | |
286 self.clpositional() | |
287 else: | |
288 self.prepargp() | |
289 self.clargparse() | |
290 | |
291 def clsimple(self): | |
292 """no parameters or repeats - uses < and > for i/o""" | |
293 aXCL = self.xmlcl.append | |
294 if len(self.infiles) > 0: | |
295 aXCL("<") | |
296 aXCL("$%s" % self.infiles[0]["infilename"]) | |
297 if len(self.outfiles) > 0: | |
298 aXCL(">") | |
299 aXCL("$%s" % self.outfiles[0]["name"]) | |
300 if self.args.cl_user_suffix: # DIY CL end | |
301 clp = shlex.split(self.args.cl_user_suffix) | |
302 for c in clp: | |
303 aXCL(c) | |
304 | |
305 def prepargp(self): | |
306 xclsuffix = [] | |
307 for i, p in enumerate(self.infiles): | |
308 nam = p["infilename"] | |
309 if p["origCL"].strip().upper() == "STDIN": | |
310 xappendme = [ | |
311 nam, | |
312 nam, | |
313 "< $%s" % nam, | |
314 ] | |
315 else: | |
316 rep = p["repeat"] == "1" | |
317 over = "" | |
318 if rep: | |
319 over = f'#for $rep in $R_{nam}:\n--{nam} "$rep.{nam}"\n#end for' | |
320 xappendme = [p["CL"], "$%s" % p["CL"], over] | |
321 xclsuffix.append(xappendme) | |
322 for i, p in enumerate(self.outfiles): | |
323 if p["origCL"].strip().upper() == "STDOUT": | |
324 self.lastxclredirect = [">", "$%s" % p["name"]] | |
325 else: | |
326 xclsuffix.append([p["name"], "$%s" % p["name"], ""]) | |
327 for p in self.addpar: | |
328 nam = p["name"] | |
329 rep = p["repeat"] == "1" | |
330 if rep: | |
331 over = f'#for $rep in $R_{nam}:\n--{nam} "$rep.{nam}"\n#end for' | |
332 else: | |
333 over = p["override"] | |
334 xclsuffix.append([p["CL"], '"$%s"' % nam, over]) | |
335 for p in self.selpar: | |
336 xclsuffix.append([p["CL"], '"$%s"' % p["name"], p["override"]]) | |
337 self.xclsuffix = xclsuffix | |
338 | |
339 def prepclpos(self): | |
340 xclsuffix = [] | |
341 for i, p in enumerate(self.infiles): | |
342 if p["origCL"].strip().upper() == "STDIN": | |
343 xappendme = [ | |
344 "999", | |
345 p["infilename"], | |
346 "< $%s" % p["infilename"], | |
347 ] | |
348 else: | |
349 xappendme = [p["CL"], "$%s" % p["infilename"], ""] | |
350 xclsuffix.append(xappendme) | |
351 for i, p in enumerate(self.outfiles): | |
352 if p["origCL"].strip().upper() == "STDOUT": | |
353 self.lastxclredirect = [">", "$%s" % p["name"]] | |
354 else: | |
355 xclsuffix.append([p["CL"], "$%s" % p["name"], ""]) | |
356 for p in self.addpar: | |
357 nam = p["name"] | |
358 rep = p["repeat"] == "1" # repeats make NO sense | |
359 if rep: | |
360 print( | |
361 f"### warning. Repeats for {nam} ignored - not permitted in positional parameter command lines!" | |
362 ) | |
363 over = p["override"] | |
364 xclsuffix.append([p["CL"], '"$%s"' % nam, over]) | |
365 for p in self.selpar: | |
366 xclsuffix.append([p["CL"], '"$%s"' % p["name"], p["override"]]) | |
367 xclsuffix.sort() | |
368 self.xclsuffix = xclsuffix | |
369 | |
370 def prepScript(self): | |
371 rx = open(self.args.script_path, "r").readlines() | |
372 rx = [x.rstrip() for x in rx] | |
373 rxcheck = [x.strip() for x in rx if x.strip() > ""] | |
374 assert len(rxcheck) > 0, "Supplied script is empty. Cannot run" | |
375 self.script = "\n".join(rx) | |
376 fhandle, self.sfile = tempfile.mkstemp( | |
377 prefix=self.tool_name, suffix="_%s" % (self.executeme[0]) | |
378 ) | |
379 tscript = open(self.sfile, "w") | |
380 tscript.write(self.script) | |
381 tscript.close() | |
382 self.spacedScript = [f" {x}" for x in rx if x.strip() > ""] | |
383 rx.insert(0, "#raw") | |
384 rx.append("#end raw") | |
385 self.escapedScript = rx | |
386 art = "%s.%s" % (self.tool_name, self.executeme[0]) | |
387 artifact = open(art, "wb") | |
388 artifact.write(bytes(self.script, "utf8")) | |
389 artifact.close() | |
390 | |
391 def cleanuppar(self): | |
392 """ positional parameters are complicated by their numeric ordinal""" | |
393 if self.args.parampass == "positional": | |
394 for i, p in enumerate(self.infiles): | |
395 assert ( | |
396 p["CL"].isdigit() or p["CL"].strip().upper() == "STDIN" | |
397 ), "Positional parameters must be ordinal integers - got %s for %s" % ( | |
398 p["CL"], | |
399 p["label"], | |
400 ) | |
401 for i, p in enumerate(self.outfiles): | |
402 assert ( | |
403 p["CL"].isdigit() or p["CL"].strip().upper() == "STDOUT" | |
404 ), "Positional parameters must be ordinal integers - got %s for %s" % ( | |
405 p["CL"], | |
406 p["name"], | |
407 ) | |
408 for i, p in enumerate(self.addpar): | |
409 assert p[ | |
410 "CL" | |
411 ].isdigit(), "Positional parameters must be ordinal integers - got %s for %s" % ( | |
412 p["CL"], | |
413 p["name"], | |
414 ) | |
415 for i, p in enumerate(self.infiles): | |
416 infp = copy.copy(p) | |
417 infp["origCL"] = infp["CL"] | |
418 if self.args.parampass in ["positional", "0"]: | |
419 infp["infilename"] = infp["label"].replace(" ", "_") | |
420 else: | |
421 infp["infilename"] = infp["CL"] | |
422 self.infiles[i] = infp | |
423 for i, p in enumerate(self.outfiles): | |
424 outfp = copy.copy(p) | |
425 outfp["origCL"] = outfp["CL"] # keep copy | |
426 self.outfiles[i] = outfp | |
427 for i, p in enumerate(self.addpar): | |
428 addp = copy.copy(p) | |
429 addp["origCL"] = addp["CL"] | |
430 self.addpar[i] = addp | |
431 | |
432 def clpositional(self): | |
433 # inputs in order then params | |
434 aXCL = self.xmlcl.append | |
435 for (k, v, koverride) in self.xclsuffix: | |
436 aXCL(v) | |
437 if self.lastxclredirect: | |
438 for cl in self.lastxclredirect: | |
439 aXCL(cl) | |
440 if self.args.cl_user_suffix: # DIY CL end | |
441 clp = shlex.split(self.args.cl_user_suffix) | |
442 for c in clp: | |
443 aXCL(c) | |
444 | |
445 def clargparse(self): | |
446 """argparse style""" | |
447 aXCL = self.xmlcl.append | |
448 # inputs then params in argparse named form | |
449 | |
450 for (k, v, koverride) in self.xclsuffix: | |
451 if koverride > "": | |
452 k = koverride | |
453 aXCL(k) | |
454 else: | |
455 if len(k.strip()) == 1: | |
456 k = "-%s" % k | |
457 else: | |
458 k = "--%s" % k | |
459 aXCL(k) | |
460 aXCL(v) | |
461 if self.lastxclredirect: | |
462 for cl in self.lastxclredirect: | |
463 aXCL(cl) | |
464 if self.args.cl_user_suffix: # DIY CL end | |
465 clp = shlex.split(self.args.cl_user_suffix) | |
466 for c in clp: | |
467 aXCL(c) | |
468 | |
469 def getNdash(self, newname): | |
470 if self.is_positional: | |
471 ndash = 0 | |
472 else: | |
473 ndash = 2 | |
474 if len(newname) < 2: | |
475 ndash = 1 | |
476 return ndash | |
477 | |
478 def doXMLparam(self): # noqa | |
479 """Add all needed elements to tool""" | |
480 for p in self.outfiles: | |
481 newname = p["name"] | |
482 newfmt = p["format"] | |
483 newcl = p["CL"] | |
484 test = p["test"] | |
485 oldcl = p["origCL"] | |
486 test = test.strip() | |
487 ndash = self.getNdash(newcl) | |
488 aparm = gxtp.OutputData( | |
489 name=newname, format=newfmt, num_dashes=ndash, label=newname | |
490 ) | |
491 aparm.positional = self.is_positional | |
492 if self.is_positional: | |
493 if oldcl.upper() == "STDOUT": | |
494 aparm.positional = 9999999 | |
495 aparm.command_line_override = "> $%s" % newname | |
496 else: | |
497 aparm.positional = int(oldcl) | |
498 aparm.command_line_override = "$%s" % newname | |
499 self.toutputs.append(aparm) | |
500 ld = None | |
501 if test.strip() > "": | |
502 if test.startswith("diff"): | |
503 c = "diff" | |
504 ld = 0 | |
505 if test.split(":")[1].isdigit(): | |
506 ld = int(test.split(":")[1]) | |
507 tp = gxtp.TestOutput( | |
508 name=newname, | |
509 value="%s_sample" % newname, | |
510 compare=c, | |
511 lines_diff=ld, | |
512 ) | |
513 elif test.startswith("sim_size"): | |
514 c = "sim_size" | |
515 tn = test.split(":")[1].strip() | |
516 if tn > "": | |
517 if "." in tn: | |
518 delta = None | |
519 delta_frac = min(1.0, float(tn)) | |
520 else: | |
521 delta = int(tn) | |
522 delta_frac = None | |
523 tp = gxtp.TestOutput( | |
524 name=newname, | |
525 value="%s_sample" % newname, | |
526 compare=c, | |
527 delta=delta, | |
528 delta_frac=delta_frac, | |
529 ) | |
530 else: | |
531 c = test | |
532 tp = gxtp.TestOutput( | |
533 name=newname, | |
534 value="%s_sample" % newname, | |
535 compare=c, | |
536 ) | |
537 self.testparam.append(tp) | |
538 for p in self.infiles: | |
539 newname = p["infilename"] | |
540 newfmt = p["format"] | |
541 ndash = self.getNdash(newname) | |
542 reps = p.get("repeat", "0") == "1" | |
543 if not len(p["label"]) > 0: | |
544 alab = p["CL"] | |
545 else: | |
546 alab = p["label"] | |
547 aninput = gxtp.DataParam( | |
548 newname, | |
549 optional=False, | |
550 label=alab, | |
551 help=p["help"], | |
552 format=newfmt, | |
553 multiple=False, | |
554 num_dashes=ndash, | |
555 ) | |
556 aninput.positional = self.is_positional | |
557 if self.is_positional: | |
558 if p["origCL"].upper() == "STDIN": | |
559 aninput.positional = 9999998 | |
560 aninput.command_line_override = "< $%s" % newname | |
561 else: | |
562 aninput.positional = int(p["origCL"]) | |
563 aninput.command_line_override = "$%s" % newname | |
564 if reps: | |
565 repe = gxtp.Repeat( | |
566 name=f"R_{newname}", title=f"Add as many {alab} as needed" | |
567 ) | |
568 repe.append(aninput) | |
569 self.tinputs.append(repe) | |
570 tparm = gxtp.TestRepeat(name=f"R_{newname}") | |
571 tparm2 = gxtp.TestParam(newname, value="%s_sample" % newname) | |
572 tparm.append(tparm2) | |
573 self.testparam.append(tparm) | |
574 else: | |
575 self.tinputs.append(aninput) | |
576 tparm = gxtp.TestParam(newname, value="%s_sample" % newname) | |
577 self.testparam.append(tparm) | |
578 for p in self.addpar: | |
579 newname = p["name"] | |
580 newval = p["value"] | |
581 newlabel = p["label"] | |
582 newhelp = p["help"] | |
583 newtype = p["type"] | |
584 newcl = p["CL"] | |
585 oldcl = p["origCL"] | |
586 reps = p["repeat"] == "1" | |
587 if not len(newlabel) > 0: | |
588 newlabel = newname | |
589 ndash = self.getNdash(newname) | |
590 if newtype == "text": | |
591 aparm = gxtp.TextParam( | |
592 newname, | |
593 label=newlabel, | |
594 help=newhelp, | |
595 value=newval, | |
596 num_dashes=ndash, | |
597 ) | |
598 elif newtype == "integer": | |
599 aparm = gxtp.IntegerParam( | |
600 newname, | |
601 label=newlabel, | |
602 help=newhelp, | |
603 value=newval, | |
604 num_dashes=ndash, | |
605 ) | |
606 elif newtype == "float": | |
607 aparm = gxtp.FloatParam( | |
608 newname, | |
609 label=newlabel, | |
610 help=newhelp, | |
611 value=newval, | |
612 num_dashes=ndash, | |
613 ) | |
614 elif newtype == "boolean": | |
615 aparm = gxtp.BooleanParam( | |
616 newname, | |
617 label=newlabel, | |
618 help=newhelp, | |
619 value=newval, | |
620 num_dashes=ndash, | |
621 ) | |
622 else: | |
623 raise ValueError( | |
624 'Unrecognised parameter type "%s" for\ | |
625 additional parameter %s in makeXML' | |
626 % (newtype, newname) | |
627 ) | |
628 aparm.positional = self.is_positional | |
629 if self.is_positional: | |
630 aparm.positional = int(oldcl) | |
631 if reps: | |
632 repe = gxtp.Repeat( | |
633 name=f"R_{newname}", title=f"Add as many {newlabel} as needed" | |
634 ) | |
635 repe.append(aparm) | |
636 self.tinputs.append(repe) | |
637 tparm = gxtp.TestRepeat(name=f"R_{newname}") | |
638 tparm2 = gxtp.TestParam(newname, value=newval) | |
639 tparm.append(tparm2) | |
640 self.testparam.append(tparm) | |
641 else: | |
642 self.tinputs.append(aparm) | |
643 tparm = gxtp.TestParam(newname, value=newval) | |
644 self.testparam.append(tparm) | |
645 for p in self.selpar: | |
646 newname = p["name"] | |
647 newval = p["value"] | |
648 newlabel = p["label"] | |
649 newhelp = p["help"] | |
650 newtype = p["type"] | |
651 newcl = p["CL"] | |
652 if not len(newlabel) > 0: | |
653 newlabel = newname | |
654 ndash = self.getNdash(newname) | |
655 if newtype == "selecttext": | |
656 newtext = p["texts"] | |
657 aparm = gxtp.SelectParam( | |
658 newname, | |
659 label=newlabel, | |
660 help=newhelp, | |
661 num_dashes=ndash, | |
662 ) | |
663 for i in range(len(newval)): | |
664 anopt = gxtp.SelectOption( | |
665 value=newval[i], | |
666 text=newtext[i], | |
667 ) | |
668 aparm.append(anopt) | |
669 aparm.positional = self.is_positional | |
670 if self.is_positional: | |
671 aparm.positional = int(newcl) | |
672 self.tinputs.append(aparm) | |
673 tparm = gxtp.TestParam(newname, value=newval) | |
674 self.testparam.append(tparm) | |
675 else: | |
676 raise ValueError( | |
677 'Unrecognised parameter type "%s" for\ | |
678 selecttext parameter %s in makeXML' | |
679 % (newtype, newname) | |
680 ) | |
681 for p in self.collections: | |
682 newkind = p["kind"] | |
683 newname = p["name"] | |
684 newlabel = p["label"] | |
685 newdisc = p["discover"] | |
686 collect = gxtp.OutputCollection(newname, label=newlabel, type=newkind) | |
687 disc = gxtp.DiscoverDatasets( | |
688 pattern=newdisc, directory=f"{newname}", visible="false" | |
689 ) | |
690 collect.append(disc) | |
691 self.toutputs.append(collect) | |
692 try: | |
693 tparm = gxtp.TestOutputCollection(newname) # broken until PR merged. | |
694 self.testparam.append(tparm) | |
695 except Exception: | |
696 print( | |
697 "#### WARNING: Galaxyxml version does not have the PR merged yet - tests for collections must be over-ridden until then!" | |
698 ) | |
699 | |
700 def doNoXMLparam(self): | |
701 """filter style package - stdin to stdout""" | |
702 if len(self.infiles) > 0: | |
703 alab = self.infiles[0]["label"] | |
704 if len(alab) == 0: | |
705 alab = self.infiles[0]["infilename"] | |
706 max1s = ( | |
707 "Maximum one input if parampass is 0 but multiple input files supplied - %s" | |
708 % str(self.infiles) | |
709 ) | |
710 assert len(self.infiles) == 1, max1s | |
711 newname = self.infiles[0]["infilename"] | |
712 aninput = gxtp.DataParam( | |
713 newname, | |
714 optional=False, | |
715 label=alab, | |
716 help=self.infiles[0]["help"], | |
717 format=self.infiles[0]["format"], | |
718 multiple=False, | |
719 num_dashes=0, | |
720 ) | |
721 aninput.command_line_override = "< $%s" % newname | |
722 aninput.positional = True | |
723 self.tinputs.append(aninput) | |
724 tp = gxtp.TestParam(name=newname, value="%s_sample" % newname) | |
725 self.testparam.append(tp) | |
726 if len(self.outfiles) > 0: | |
727 newname = self.outfiles[0]["name"] | |
728 newfmt = self.outfiles[0]["format"] | |
729 anout = gxtp.OutputData(newname, format=newfmt, num_dashes=0) | |
730 anout.command_line_override = "> $%s" % newname | |
731 anout.positional = self.is_positional | |
732 self.toutputs.append(anout) | |
733 tp = gxtp.TestOutput(name=newname, value="%s_sample" % newname) | |
734 self.testparam.append(tp) | |
735 | |
736 def makeXML(self): # noqa | |
737 """ | |
738 Create a Galaxy xml tool wrapper for the new script | |
739 Uses galaxyxml | |
740 Hmmm. How to get the command line into correct order... | |
741 """ | |
742 if self.command_override: | |
743 self.newtool.command_override = self.command_override # config file | |
744 else: | |
745 self.newtool.command_override = self.xmlcl | |
746 cite = gxtp.Citations() | |
747 acite = gxtp.Citation(type="doi", value="10.1093/bioinformatics/bts573") | |
748 cite.append(acite) | |
749 self.newtool.citations = cite | |
750 safertext = "" | |
751 if self.args.help_text: | |
752 with open(self.args.help_text, "r") as helpf: | |
753 safertext = "\n".join([cheetah_escape(x) for x in helpf.readlines()]) | |
754 if len(safertext.strip()) == 0: | |
755 safertext = ( | |
756 "Ask the tool author (%s) to rebuild with help text please\n" | |
757 % (self.args.user_email) | |
758 ) | |
759 if self.args.script_path: | |
760 if len(safertext) > 0: | |
761 safertext = safertext + "\n\n------\n" # transition allowed! | |
762 scr = [x for x in self.spacedScript if x.strip()] | |
763 scr.insert(0, "\n\nScript::\n") | |
764 if len(scr) > 300: | |
765 scr = ( | |
766 scr[:100] | |
767 + [" >300 lines - stuff deleted", " ......"] | |
768 + scr[-100:] | |
769 ) | |
770 scr.append("\n") | |
771 safertext = safertext + "\n".join(scr) | |
772 self.newtool.help = safertext | |
773 self.newtool.version_command = f'echo "{self.args.tool_version}"' | |
774 std = gxtp.Stdios() | |
775 std1 = gxtp.Stdio() | |
776 std.append(std1) | |
777 self.newtool.stdios = std | |
778 requirements = gxtp.Requirements() | |
779 if self.args.packages: | |
780 try: | |
781 for d in self.args.packages.split(","): | |
782 ver = None | |
783 packg = None | |
784 d = d.replace("==", ":") | |
785 d = d.replace("=", ":") | |
786 if ":" in d: | |
787 packg, ver = d.split(":") | |
788 ver = ver.strip() | |
789 packg = packg.strip() | |
790 else: | |
791 packg = d.strip() | |
792 ver = None | |
793 if ver == "": | |
794 ver = None | |
795 if packg: | |
796 requirements.append( | |
797 gxtp.Requirement("package", packg.strip(), ver) | |
798 ) | |
799 except Exception: | |
800 print( | |
801 "### malformed packages string supplied - cannot parse =", | |
802 self.args.packages, | |
803 ) | |
804 sys.exit(2) | |
805 self.newtool.requirements = requirements | |
806 if self.args.parampass == "0": | |
807 self.doNoXMLparam() | |
808 else: | |
809 self.doXMLparam() | |
810 self.newtool.outputs = self.toutputs | |
811 self.newtool.inputs = self.tinputs | |
812 if self.args.script_path: | |
813 configfiles = gxtp.Configfiles() | |
814 configfiles.append( | |
815 gxtp.Configfile(name="runme", text="\n".join(self.escapedScript)) | |
816 ) | |
817 self.newtool.configfiles = configfiles | |
818 tests = gxtp.Tests() | |
819 test_a = gxtp.Test() | |
820 for tp in self.testparam: | |
821 test_a.append(tp) | |
822 tests.append(test_a) | |
823 self.newtool.tests = tests | |
824 self.newtool.add_comment( | |
825 "Created by %s at %s using the Galaxy Tool Factory." | |
826 % (self.args.user_email, timenow()) | |
827 ) | |
828 self.newtool.add_comment("Source in git at: %s" % (toolFactoryURL)) | |
829 exml0 = self.newtool.export() | |
830 exml = exml0.replace(FAKEEXE, "") # temporary work around until PR accepted | |
831 if ( | |
832 self.test_override | |
833 ): # cannot do this inside galaxyxml as it expects lxml objects for tests | |
834 part1 = exml.split("<tests>")[0] | |
835 part2 = exml.split("</tests>")[1] | |
836 fixed = "%s\n%s\n%s" % (part1, "\n".join(self.test_override), part2) | |
837 exml = fixed | |
838 # exml = exml.replace('range="1:"', 'range="1000:"') | |
839 with open("%s.xml" % self.tool_name, "w") as xf: | |
840 xf.write(exml) | |
841 xf.write("\n") | |
842 with open(self.args.untested_tool_out, 'w') as outf: | |
843 outf.write(exml) | |
844 outf.write('\n') | |
845 # ready for the tarball | |
846 | |
847 def writeShedyml(self): | |
848 """for planemo""" | |
849 yuser = self.args.user_email.split("@")[0] | |
850 yfname = os.path.join(self.tooloutdir, ".shed.yml") | |
851 # repository metadata for planemo's tool shed commands | |
852 odict = { | |
853 "name": self.tool_name, | |
854 "owner": yuser, | |
855 "type": "unrestricted", | |
856 "description": self.args.tool_desc, | |
857 "synopsis": self.args.tool_desc, | |
858 "category": "TF Generated Tools", | |
859 } | |
860 with open(yfname, "w") as yamlf: | |
861 yaml.dump(odict, yamlf, allow_unicode=True) | |
862 | |
863 def makeTool(self): | |
864 """write xmls and input samples into place""" | |
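865 # makeXML() checks parampass itself and calls doNoXMLparam() or | |
866 # doXMLparam() as needed; it must always run because it writes the | |
867 # <tool_name>.xml file that is copied below. | |
868 self.makeXML() | |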
869 if self.args.script_path: | |
870 stname = os.path.join(self.tooloutdir, self.sfile) | |
871 if not os.path.exists(stname): | |
872 shutil.copyfile(self.sfile, stname) | |
873 xreal = "%s.xml" % self.tool_name | |
874 xout = os.path.join(self.tooloutdir, xreal) | |
875 shutil.copyfile(xreal, xout) | |
876 #xout = os.path.join(self.repdir, xreal) | |
877 #shutil.copyfile(xreal, xout) | |
878 for p in self.infiles: | |
879 pth = p["name"] | |
880 dest = os.path.join(self.testdir, "%s_sample" % p["infilename"]) | |
881 shutil.copyfile(pth, dest) | |
882 dest = os.path.join( | |
883 self.repdir, "%s_sample.%s" % (p["infilename"], p["format"]) | |
884 ) | |
885 shutil.copyfile(pth, dest) | |
886 | |
887 def makeToolTar(self, report_fail=False): | |
888 """move outputs into test-data and prepare the tarball""" | |
889 excludeme = "_planemo_test_report.html" | |
890 | |
891 def exclude_function(tarinfo): | |
892 filename = tarinfo.name | |
893 return None if filename.endswith(excludeme) else tarinfo | |
894 | |
895 for p in self.outfiles: | |
896 oname = p["name"] | |
897 tdest = os.path.join(self.testdir, "%s_sample" % oname) | |
898 src = os.path.join(self.testdir, oname) | |
899 if not os.path.isfile(tdest): | |
900 if os.path.isfile(src): | |
901 shutil.copyfile(src, tdest) | |
902 dest = os.path.join(self.repdir, "%s.sample.%s" % (oname,p['format'])) | |
903 shutil.copyfile(src, dest) | |
904 else: | |
905 if report_fail: | |
906 print( | |
907 "###Tool may have failed - output file %s not found in testdir after planemo run %s." | |
908 % (tdest, self.testdir) | |
909 ) | |
910 with tarfile.open(self.newtarpath, "w:gz") as tf: | |
911 tf.add( | |
912 name=self.tooloutdir, | |
913 arcname=self.tool_name, | |
914 filter=exclude_function, | |
915 ) | |
916 # copy only after the archive is closed, so the gzip stream is complete | |
917 shutil.copy(self.newtarpath, os.path.join(self.tooloutdir, f"{self.tool_name}_untested.toolshed.gz")) | |
918 | |
919 | |
920 def main(): | |
921 """ | |
922 This is a Galaxy wrapper. | |
923 It expects to be called by a special purpose tool.xml | |
924 | |
925 """ | |
926 parser = argparse.ArgumentParser() | |
927 a = parser.add_argument | |
928 a("--script_path", default=None) | |
929 a("--history_test", default=None) | |
930 a("--cl_user_suffix", default=None) | |
931 a("--sysexe", default=None) | |
932 a("--packages", default=None) | |
933 a("--tool_name", default="newtool") | |
934 a("--tool_dir", default=None) | |
935 a("--input_files", default=[], action="append") | |
936 a("--output_files", default=[], action="append") | |
937 a("--user_email", default="Unknown") | |
938 a("--bad_user", default=None) | |
939 a("--help_text", default=None) | |
940 a("--tool_desc", default=None) | |
941 a("--tool_version", default=None) | |
942 a("--citations", default=None) | |
943 a("--command_override", default=None) | |
944 a("--test_override", default=None) | |
945 a("--additional_parameters", action="append", default=[]) | |
946 a("--selecttext_parameters", action="append", default=[]) | |
947 a("--edit_additional_parameters", action="store_true", default=False) | |
948 a("--parampass", default="positional") | |
949 a("--tfout", default="./tfout") | |
950 a("--galaxy_root", default="/galaxy-central") | |
951 a("--galaxy_venv", default="/galaxy_venv") | |
952 a("--collection", action="append", default=[]) | |
953 a("--include_tests", default=False, action="store_true") | |
954 a("--install", default="1") | |
955 a("--admin_only", default=False, action="store_true") | |
956 a("--untested_tool_out", default=None) | |
957 a("--local_tools", default="tools") # relative to $__root_dir__ | |
958 a("--tool_conf_path", default="config/tool_conf.xml") # relative to $__root_dir__ | |
959 args = parser.parse_args() | |
960 if args.admin_only: | |
961 if args.bad_user:  # exit explicitly: assert statements are skipped under python -O | |
962 sys.exit( | |
963 'UNAUTHORISED: %s is NOT authorized to use this tool until Galaxy admin adds %s to "admin_users" in the galaxy.yml Galaxy configuration file' | |
964 % (args.bad_user, args.bad_user) | |
965 ) | |
966 assert args.tool_name, "## Tool Factory expects a tool name - eg --tool_name=DESeq" | |
967 r = Tool_Factory(args) | |
968 r.writeShedyml() | |
969 r.makeTool() | |
970 r.makeToolTar() | |
971 if args.install == "1": | |
972 TCU = Tool_Conf_Updater( | |
973 args=args, | |
974 local_tool_dir=args.local_tools, | |
975 new_tool_archive_path=r.newtarpath, | |
976 tool_conf_path=args.tool_conf_path, | |
977 new_tool_name=r.tool_name, | |
978 run_test=getattr(args, "run_test", False),  # the parser above defines no --run_test option | |
979 ) | |
980 | |
981 if __name__ == "__main__": | |
982 main() |
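
The package-string handling inside `makeXML()` can be tried in isolation. The sketch below is a hypothetical standalone helper (`parse_packages` is not a name from the ToolFactory source) that mirrors that loop under the same assumptions: `==` and `=` are both treated as `:` separators, and a missing or empty version becomes `None`. It returns plain tuples rather than `gxtp.Requirement` objects so it runs without galaxyxml installed.

```python
def parse_packages(spec):
    """Split a comma-separated package string into (name, version) pairs.

    Hypothetical helper for illustration only; it mirrors the parsing loop
    in makeXML() but does not build galaxyxml Requirement objects.
    """
    pairs = []
    for item in spec.split(","):
        # accept "pkg==1.0", "pkg=1.0" and "pkg:1.0" alike
        item = item.replace("==", ":").replace("=", ":")
        if ":" in item:
            # more than one separator (e.g. "a:1:2") raises ValueError here
            name, version = item.split(":")
        else:
            name, version = item, None
        name = name.strip()
        version = version.strip() if version else None
        if name:
            # an empty version string is normalised to None
            pairs.append((name, version or None))
    return pairs


if __name__ == "__main__":
    print(parse_packages("python=3.9, samtools==1.12, bwa:0.7.17, seqtk"))
```

As in the original loop, a malformed entry with more than one separator raises an exception; the caller decides whether to report it and exit.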