annotate README.txt @ 15:dd6cf2ddaac7 draft

Uploaded
author fubar
date Wed, 28 Jan 2015 19:28:32 -0500
parents c34063ab3735
children
# WARNING before you start
# Install this tool on a private Galaxy ONLY
# Please NEVER install it on a public or production instance
# updated August 2014 by John Chilton adding citation support
#
# updated August 8 2014 to fix bugs reported by Marius van den Beek
# please cite the resource at
http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref
# if you use this tool in your published work.

*Short Story*

This is an unusual Galaxy tool capable of generating new Galaxy tools.
It works by exposing *unrestricted* and therefore extremely dangerous scripting
to all designated administrators of the host Galaxy server, allowing them to
run scripts in R, python, sh and perl over multiple selected input data sets,
writing a single new data set as output.

*Differences between TF2 and the original Tool Factory*

1. TF2 (this one) allows any number of either fixed or user-editable parameters to be defined
for the new tool. If these are editable, the user can change them; otherwise, they are passed
as fixed and invisible parameters for each execution. Obviously, there are substantial security
implications with editable parameters. These are always sanitized by Galaxy's inbuilt
parameter sanitization, so you may need to "unsanitize" characters - eg translate all "__lt__"
into "<" for certain parameters where that is needed. Please practise safe toolshed.

2. Any number of input files (all of the same datatype) may be defined.

These changes substantially complicate the way your script is supplied with
all the new and variable parameters. Examples in each scripting language are shown
in the tool help.

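The "unsanitize" step mentioned in point 1 could look something like the
following Python sketch. It is only an illustration, not part of the Tool
Factory code: the function name unsanitize and the token table are assumptions,
and the table lists just a few of the tokens Galaxy is understood to
substitute, so check your own Galaxy version before relying on it.

```python
# Hypothetical helper: map a few of the tokens Galaxy substitutes for
# special characters back to the characters themselves.
# ASSUMPTION: this table is a small illustrative subset, not a full list.
SANITIZED_TOKENS = {
    "__lt__": "<",
    "__gt__": ">",
    "__sq__": "'",
    "__dq__": '"',
}


def unsanitize(value, tokens=SANITIZED_TOKENS):
    """Return value with each sanitized token replaced by its character."""
    for token, char in tokens.items():
        value = value.replace(token, char)
    return value


if __name__ == "__main__":
    # A parameter as it might arrive after sanitization.
    print(unsanitize("x __lt__ 5 and y __gt__ 2"))
```

Only apply this to parameters that really need the original characters, since
undoing the sanitization brings back the risks it was there to remove.
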
*Automated outputs in named sections*

If your script writes an arbitrary mix of (eg) pdfs, tabular analysis results
and run logs to the current directory, the tool factory can optionally
auto-generate a linked Html page with separate sections showing a thumbnail
grid for all pdfs and the log text, grouping all artifacts whose file names
share a log name prefix::

    eg: if "foo.log" is emitted then *all* other outputs matching foo_* will
    be grouped together - eg
    foo_baz.pdf
    foo_bar.pdf and
    foo_zot.xls
    would all be displayed and linked in the same section with foo.log's contents
    - to form the "Foo" section of the Html page. Sections appear in alphabetic
    order and there are no limits on the number of files or sections.

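The grouping rule described above can be sketched in Python as follows. This
is a hypothetical illustration, not the Tool Factory's own code: the function
name group_outputs is made up, and it assumes that the longest matching log
prefix wins when prefixes overlap, which the text above does not specify.

```python
import os
from collections import defaultdict


def group_outputs(filenames):
    """Group output files into sections named after each *.log prefix.

    A file joins section "foo" when it is "foo.log" itself or its name
    starts with "foo_". Files matching no log prefix are left out.
    """
    # ASSUMPTION: try longer prefixes first so "foo_bar" beats "foo".
    prefixes = sorted(
        (os.path.splitext(name)[0] for name in filenames if name.endswith(".log")),
        key=len,
        reverse=True,
    )
    sections = defaultdict(list)
    for name in sorted(filenames):
        for prefix in prefixes:
            if name == prefix + ".log" or name.startswith(prefix + "_"):
                sections[prefix].append(name)
                break
    # Sections are returned in alphabetical order, as the text describes.
    return dict(sorted(sections.items()))


if __name__ == "__main__":
    files = ["foo.log", "foo_baz.pdf", "foo_bar.pdf", "foo_zot.xls", "stray.txt"]
    for section, members in group_outputs(files).items():
        print(section, members)
```

In this sketch a file such as stray.txt, which matches no log prefix, simply
does not appear in any section.
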
*Automated generation of new Galaxy tools for installation into any Galaxy*

Once a script is working correctly, this tool optionally generates a
new Galaxy tool, effectively freezing the supplied script into a new,
ordinary Galaxy tool that runs it over one or more input files selected by
the user. Generated tools are installed via a tool shed by an administrator
and work exactly like all other Galaxy tools for your users.

If you use the Html output option, please ensure that sanitize_all_html is
set to False and uncommented in universe_wsgi.ini - it should show::

    # By default, all tool output served as 'text/html' will be sanitized
    sanitize_all_html = False

This opens potential security risks and may not be acceptable for public
sites. On such sites, where sanitization stays on, the lack of stylesheets may
make Html pages damage onlookers' eyeballs, but the pages should still be correct.

*More Detail*

To use the ToolFactory, you should have prepared a script to paste into a
text box, and a small test input example ready to select from your history
to test your new script.

There is an example in each scripting language on the Tool Factory form. You
can just cut and paste these to try it out - remember to select the right
interpreter please. You'll also need to create a small test data set using
the Galaxy history add new data tool.

If the script fails somehow, use the "redo" button on the tool output in
your history to recreate the form complete with broken script. Fix the bug
and execute again. Rinse, wash, repeat.

Once the script runs successfully, a new Galaxy tool that runs your script
can be generated. Select the "generate" option and supply some help text and
names. The new tool will be generated in the form of a new Galaxy datatype
- toolshed.gz - as the name suggests, it's an archive ready to upload to a
Galaxy ToolShed as a new tool repository.

Once it's in a ToolShed, it can be installed into any local Galaxy server
from the server administrative interface.

Once the new tool is installed, local users can run it - each time, the script
that was supplied when it was built will be executed with the input chosen
from the user's history. In other words, the tools you generate with the
ToolFactory run just like any other Galaxy tool, but run your script every time.

Tool factory tools are perfect for workflow components. One input, one output,
no variables.

*To fully and safely exploit the awesome power* of this tool,
Galaxy and the ToolShed, you should be a developer installing this
tool on a private/personal/scratch local instance where you are an
admin_user. Then, if you break it, you get to keep all the pieces - see
https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

*Installation*

This is a Galaxy tool. You can install it most conveniently using the
administrative "Search and browse tool sheds" link. Find the Galaxy Main
toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory
repository. Open it, review the code and select the option to install it.

(
If you can't get the tool that way, the xml and py files here need to be
copied into a new tools subdirectory such as tools/toolfactory. Your
tool_conf.xml needs a new entry pointing to the xml file - something like::

    <section name="Tool building tools" id="toolbuilders">
        <tool file="toolfactory/rgToolFactory.xml"/>
    </section>

If not already there (I just added it to datatypes_conf.xml.sample),
please add::

    <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary"
        mimetype="multipart/x-gzip" subclass="True" />

to your local datatypes_conf.xml.
)

Of course, R, python, perl etc are needed on your path if you want to test
scripts using those interpreters. Adding new ones to this tool code should
be easy enough. Please make suggestions as bitbucket issues and code. The
HTML file code automatically shrinks R's bloated pdfs, and depends on
ghostscript. The thumbnails require imagemagick.

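Before testing scripts, a quick check along these lines can show which
programs are missing from your path. It is a sketch only: the function name
missing_programs is made up, and the executable names, in particular gs for
ghostscript and convert for imagemagick, are assumptions that may differ on
your system.

```python
import shutil


def missing_programs(programs):
    """Return the programs from the list that are not found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


if __name__ == "__main__":
    # ASSUMPTION: these executable names are typical, not guaranteed.
    wanted = ["Rscript", "python", "perl", "sh", "gs", "convert"]
    missing = missing_programs(wanted)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All interpreters and helpers were found.")
```

Run it as the same user that runs Galaxy jobs, since that user's path is the
one that matters.
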
*Restricted execution*

The tool factory tool itself will then be usable ONLY by admin users -
people with IDs in admin_users in universe_wsgi.ini. **Yes, that's right. ONLY
admin_users can run this tool.** Think about it for a moment. If allowed to
run any arbitrary script on your Galaxy server, the only thing that would
impede a miscreant bent on destroying all your Galaxy data would probably
be lack of appropriate technical skills.

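As a rough illustration of the admin_users rule, the Python sketch below reads
a comma separated admin_users entry from ini text and checks one address
against it. This is not how Galaxy itself enforces the restriction; the
[app:main] section name, the helper names and the example addresses are all
assumptions made for the example.

```python
import configparser

# Hypothetical ini fragment; real files contain many more settings.
SAMPLE_INI = """
[app:main]
admin_users = alice@example.org, bob@example.org
"""


def admin_users(ini_text, section="app:main"):
    """Return the set of addresses listed in the admin_users entry."""
    # Interpolation is disabled so that '%' in other values cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get(section, "admin_users", fallback="")
    return {email.strip() for email in raw.split(",") if email.strip()}


def may_run_tool_factory(email, ini_text):
    """True only when email appears in admin_users."""
    return email in admin_users(ini_text)


if __name__ == "__main__":
    print(may_run_tool_factory("alice@example.org", SAMPLE_INI))
    print(may_run_tool_factory("mallory@example.org", SAMPLE_INI))
```

If the admin_users line is commented out or missing, this sketch treats
nobody as an administrator.
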
*What it does* This is a tool factory for simple scripts in python, R and
perl currently. Functional tests are automatically generated. How cool is that.

LIMITED to simple scripts that read one input from the history. Optionally can
write one new history dataset, and optionally collect any number of outputs
into links on an autogenerated HTML index page for the user to navigate -
useful if the script writes images and output files - pdf outputs are shown
as thumbnails and R's bloated pdfs are shrunk with ghostscript, so that and
imagemagick need to be available.

Generated tools can be edited and enhanced like any Galaxy tool, so start
small and build up, since a generated script gets you a serious leg up to a
more complex one.

*What you do* You paste and run your script, you fix the syntax errors and
eventually it runs. You can use the redo button and edit the script before
trying to rerun it as you debug - it works pretty well.

Once the script works on some test data, you can generate a toolshed compatible
gzip file containing your script ready to run as an ordinary Galaxy tool in
a repository on your local toolshed. That means safe and largely automated
installation in any production Galaxy configured to use your toolshed.

*Generated tool Security* Once you install a generated tool, it's just
another tool - assuming the script is safe. Generated tools just run normally and
their users cannot do anything unusually insecure but please, practise safe toolshed.
Read the fucking code before you install any tool. Especially this one -
it is really scary.

If you opt for an HTML output, you get all the script outputs arranged
as a single Html history item - all output files are linked, with thumbnails for
all the pdfs. Ugly but really inexpensive.

Patches and suggestions welcome as bitbucket issues please.

copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012

all rights reserved
Licensed under the LGPL. If you want to improve it, feel free:
https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

Material for our more enthusiastic and voracious readers continues below -
we salute you.

**Motivation** Simple transformation, filtering or reporting scripts get
written, run and lost every day in most busy labs - even ours where Galaxy is
in use. This 'dark script matter' is pervasive and generally not reproducible.

**Benefits** For our group, this allows Galaxy to fill that important dark
script gap - all those "small" bioinformatics tasks. Once a user has a working
R (or python or perl) script that does something Galaxy cannot currently do
(eg transpose a tabular file) and takes parameters the way Galaxy supplies
them (see example below), they:

1. Install the tool factory on a personal private instance

2. Upload a small test data set

3. Paste the script into the 'script' text box and iteratively run the
insecure tool on test data until it works right - there is absolutely no
reason to do this anywhere other than on a personal private instance.

4. Once it works right, set the 'Generate toolshed gzip' option and run
it again.

5. A toolshed style gzip appears ready to upload and install like any other
Toolshed entry.

6. Upload the new tool to the toolshed

7. Ask the local admin to check the new tool to confirm it's not evil and
install it in the local production galaxy

**Simple examples on the tool form**

A simple Rscript "filter" showing how the command line parameters can be
handled. It takes an input file, does something (transpose in this case) and
writes the results to a new tabular file::

    # transpose a tabular input file and write as a tabular output file
    ourargs = commandArgs(TRUE)
    inf = ourargs[1]
    outf = ourargs[2]
    inp = read.table(inf,head=F,row.names=NULL,sep='\t')
    outp = t(inp)
    write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)

Calculate a multiple test adjusted p value from a column of p values -
for this script to be useful, the right input column must be
specified in the code for the input file type(s) chosen when the
tool is generated::

    # use p.adjust - assumes a HEADER row and column 1 - please fix for any
    # real use
    column = 1 # adjust if necessary for some other kind of input
    fdrmeth = 'BH'
    ourargs = commandArgs(TRUE)
    inf = ourargs[1]
    outf = ourargs[2]
    inp = read.table(inf,head=T,row.names=NULL,sep='\t')
    p = inp[,column]
    q = p.adjust(p,method=fdrmeth)
    newval = paste(fdrmeth,'p-value',sep='_')
    q = data.frame(q)
    names(q) = newval
    outp = cbind(inp,newval=q)
    write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=T)

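For readers wondering what p.adjust does with method 'BH', here is a plain
Python sketch of the Benjamini-Hochberg adjustment. It is an illustration
written for this README, not R's source code, and the function name bh_adjust
is made up; missing value handling is left out.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Visit the p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        # Scale by n / rank, then keep the running minimum so the adjusted
        # values never increase as the raw p-values decrease.
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is multiplied by the number of tests divided by its rank, and
the running minimum keeps the adjusted values in the same order as the raw ones.
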
Another Rscript example without any input file - generates a random heatmap
pdf - you must make sure the option to create an HTML output file is
turned on for this to work. The heatmap will be presented as a thumbnail
linked to the pdf in the resulting HTML page::

    # note this script takes NO input or output because it generates random data
    foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100),
        e=runif(100),f=runif(100))
    bar = as.matrix(foo)
    pdf( "heattest.pdf" )
    heatmap(bar,main='Random Heatmap')
    dev.off()

15
dd6cf2ddaac7 Uploaded
fubar
parents: 0
diff changeset
269 A Python example that reverses each row of a tabular file. You'll need
dd6cf2ddaac7 Uploaded
fubar
parents: 0
diff changeset
270 to remove the leading spaces for this to work if cut and pasted into the
dd6cf2ddaac7 Uploaded
fubar
parents: 0
diff changeset
271 script box. Note that you can already do this in Galaxy by setting up the
dd6cf2ddaac7 Uploaded
fubar
parents: 0
diff changeset
272 cut columns tool with the correct number of columns in reverse order,but
dd6cf2ddaac7 Uploaded
fubar
parents: 0
diff changeset
273 this script will work for any number of columns so is completely generic::

    # reverse order of columns in a tabular file
    import sys
    inp = sys.argv[1]
    outp = sys.argv[2]
    i = open(inp,'r')
    o = open(outp,'w')
    for row in i:
        # strip only the line ending so empty trailing columns are kept
        rs = row.rstrip('\r\n').split('\t')
        rs.reverse()
        o.write('\t'.join(rs))
        o.write('\n')
    i.close()
    o.close()
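
The same reversal can be sketched as a small function, which makes it easier
to try outside Galaxy before pasting it into the script box. This is only a
sketch: the function name reverse_columns is made up for illustration and is
not part of the Tool Factory. As in the script above, the first argument is
taken to be the input path and the second the output path.

```python
import sys


def reverse_columns(in_path, out_path):
    """Write each tab-separated row of in_path to out_path with columns reversed."""
    # the with statement closes both files even if a row raises an error
    with open(in_path, 'r') as src, open(out_path, 'w') as dst:
        for row in src:
            # strip only the line ending so empty trailing columns survive
            fields = row.rstrip('\r\n').split('\t')
            dst.write('\t'.join(reversed(fields)) + '\n')


if __name__ == '__main__' and len(sys.argv) == 3:
    # same argument order as the script above: input path, then output path
    reverse_columns(sys.argv[1], sys.argv[2])
```

Keeping the work in a function means the argument handling stays in one
place if the script later grows extra steps.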


Galaxy as an IDE for developing API scripts
If you need to develop Galaxy API scripts and you like to live dangerously,
please read on.

Galaxy as an IDE?
Amazingly enough, blend-lib API scripts run perfectly well *inside*
Galaxy when pasted into a Tool Factory form. No need to generate a new
tool. Galaxy+Tool_Factory = IDE. I think we need a new t-shirt. Seriously,
it is actually quite usable.

Why bother - what's wrong with Eclipse?
Nothing. But compared with developing API scripts in the usual way outside
Galaxy, you get persistence and other framework benefits plus, at absolutely
no extra charge, a ginormous security problem if you share the history or
any outputs, because they contain the API script complete with your key.
So development servers only, please!

Workflow
Fire up the Tool Factory in Galaxy.

Leave the input box empty, set the interpreter to python, then paste and run
an API script, eg the working example below (substitute your own url and key).

It took me a few iterations to develop the example below because I know
almost nothing about the API. I started with very simple code from one of the
samples. After each run, the (edited..) API script is conveniently recreated
using the redo button on the history output item, so each successive version
of the developing API script you run is persisted, ready to be edited and
rerun easily. It is *very* handy to be able to add a line of code to the
script and run it, then view the output to (eg) inspect dicts returned by
API calls and so move progressively deeper.

Give the below a whirl on a private clone (install the tool factory from
the main toolshed) and try adding complexity with a few rerun/edit/rerun
cycles.

Eg tool factory api script::

    import sys
    from blend.galaxy import GalaxyInstance
    ourGal = 'http://x.x.x.x:xxxx'
    ourKey = 'xxx'
    gi = GalaxyInstance(ourGal, key=ourKey)
    libs = gi.libraries.get_libraries()
    res = []
    # libs looks like
    # u'url': u'/galaxy/api/libraries/441d8112651dc2f3', u'id':
    # u'441d8112651dc2f3', u'name':.... u'Demonstration sample RNA data',
    for lib in libs:
        res.append('%s:\n' % lib['name'])
        res.append(str(gi.libraries.show_library(lib['id'],contents=True)))
    outf=open(sys.argv[2],'w')
    outf.write('\n'.join(res))
    outf.close()
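
When inspecting the dicts returned by API calls, as described above, it may
help to pretty-print them before writing them to the output file. The sketch
below uses only the standard library and needs no Galaxy server. The helper
name format_libraries and the sample data are made up for illustration, and
it assumes each item is a dict with a 'name' key, as the comment in the
script above suggests.

```python
import json


def format_libraries(libs):
    """Return a readable text dump of a list of library dicts."""
    chunks = []
    for lib in libs:
        # sort_keys gives a stable layout, so successive runs are easy to compare
        dump = json.dumps(lib, indent=2, sort_keys=True)
        chunks.append('%s:\n%s' % (lib['name'], dump))
    return '\n'.join(chunks)


# made-up sample standing in for the result of gi.libraries.get_libraries()
sample = [{'name': 'Demonstration sample RNA data', 'id': '441d8112651dc2f3'}]
print(format_libraries(sample))
```

In a Tool Factory script the returned string could be written to the output
file in place of the str() dump used above.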

**Attribution**
Creating re-usable tools from scripts: The Galaxy Tool Factory
Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team
Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573

http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref

**Licensing**
Copyright Ross Lazarus 2010
ross lazarus at g mail period com

All rights reserved.

Licensed under the LGPL

**Obligatory screenshot**

http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png