This package is a Galaxy workflow for the BlockClust pipeline.


======
Galaxy
======

`Galaxy <http://galaxyproject.org/>`_ is an open, web-based platform for data-intensive research.
All tools can be combined into workflows without any programming skills.
Furthermore, the platform can be extended with more tools at any time.
Each tool comes with its own description of what it does and what its input should look like.
You can make data available to Galaxy by uploading local files or downloading online content.
Input files, workflow steps and results are stored in a history, where you can view them or access them again later.
You can share workflows and histories with other users or make them publicly available.
Saved workflows can be run on new input files or used to rerun an analysis, which ensures repeatability.


Getting Started
===============

BlockClust can be installed on all common Unix systems.
However, it is developed on Linux and I don't have access to OS X. You are welcome to help improve this documentation; just contact_ me.

For any additional information, especially on cluster configuration or general Galaxy_ questions,
please have a look at the Galaxy wiki.

- http://wiki.galaxyproject.org/

- http://wiki.galaxyproject.org/Admin/

- http://galaxyproject.org/search/web/

.. _contact: https://github.com/bgruening
.. _Galaxy: http://galaxyproject.org/

Prerequisites::

    * Python 2.6 or 2.7
    * C, C++ and Fortran compilers
    * Autotools
    * CMake
    * cairo development files (used for PNG depictions)
    * Python development files
    * Java Runtime Environment (JRE, used by OPSIN and NPLS)

To install all of the prerequisites, run the command that matches your OS:

- Debian-based systems: ``apt-get install build-essential gfortran cmake mercurial libcairo2-dev python-dev``
- Fedora: ``yum install make automake gcc gcc-c++ gcc-gfortran cmake mercurial libcairo2-devel python-devel``
- OS X (MacPorts_): ``port install gcc cmake automake mercurial cairo-devel``

.. _MacPorts: http://www.macports.org/


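Before starting the installation it can help to check which build tools are already on your ``PATH``. The following is only a minimal sketch: the helper name ``missing_tools`` is hypothetical, and the command names in the example call are assumptions about how the prerequisites above are typically named on a Linux system::

    # Print every command from the argument list that is not found on PATH.
    # Prints nothing when all of them are available.
    missing_tools() {
        for tool in "$@"; do
            command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
        done
    }

    # Example call; the command names are assumptions, adjust them to your system.
    missing_tools gcc g++ gfortran cmake automake hg java

Any name that is printed still needs to be installed with one of the package manager commands listed above.
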
===================
Galaxy installation
===================


0. Create a sandboxed Python environment using virtualenv_ (not necessary but recommended)::

    wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py
    python ./virtualenv.py --no-site-packages galaxy_env
    . ./galaxy_env/bin/activate

.. _virtualenv: http://www.virtualenv.org/


1. Clone the latest `Galaxy platform`_::

    hg clone https://bitbucket.org/galaxy/galaxy-central/

.. _Galaxy platform: http://wiki.galaxyproject.org/Admin/Get%20Galaxy

2. Navigate to the galaxy-central folder and update it::

    cd ~/galaxy-central
    hg pull
    hg update

This step is not necessary if you have a fresh checkout, but it is good to know.

3. Create folders for the Tool Shed tools and the dependencies::

    mkdir ~/shed_tools
    mkdir ~/galaxy-central/tool_deps

4. Create the configuration file::

    cp ~/galaxy-central/universe_wsgi.ini.sample ~/galaxy-central/universe_wsgi.ini

5. Open ``universe_wsgi.ini`` to change the dependencies directory::

    Linux: gedit ~/galaxy-central/universe_wsgi.ini
    OS X:  open -a TextEdit ~/galaxy-central/universe_wsgi.ini

6. Search for ``tool_dependency_dir = None`` and change it to ``tool_dependency_dir = ./tool_deps``; remove the leading ``#`` if needed.

7. Remove the ``#`` in front of ``tool_config_file`` and ``tool_path``.

8. (Re-)Start the Galaxy daemon::

    sh run.sh --reload

In daemon mode all logs are written to main.log in your Galaxy home directory. You can also use::

    run.sh

During the first startup Galaxy prepares your database, which can take some time. Have a look at the log file if you want to know what is happening.

After launching, Galaxy is accessible in your browser at ``http://localhost:8080/``.


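The edits from steps 6 and 7 can also be scripted instead of being made in a text editor. This is a rough sketch under stated assumptions, not part of the BlockClust or Galaxy tooling: ``set_ini_option`` is a hypothetical helper, it assumes each option sits on its own line in the form ``#key = value`` or ``key = value``, and it assumes the value contains no ``|``, ``&`` or backslash characters::

    # Uncomment KEY in FILE. If VALUE is given, also replace the current value.
    # Usage: set_ini_option FILE KEY [VALUE]
    set_ini_option() {
        file=$1 key=$2 value=${3-}
        if [ -n "$value" ]; then
            # Replace the whole line with "KEY = VALUE".
            sed "s|^#*[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file" > "$file.tmp"
        else
            # Only strip the leading "#" and keep the existing value.
            sed "s|^#*[[:space:]]*\($key[[:space:]]*=.*\)|\1|" "$file" > "$file.tmp"
        fi
        mv "$file.tmp" "$file"
    }

    # Apply steps 6 and 7, but only if the configuration file from step 4 exists.
    ini=~/galaxy-central/universe_wsgi.ini
    if [ -f "$ini" ]; then
        set_ini_option "$ini" tool_dependency_dir ./tool_deps
        set_ini_option "$ini" tool_config_file
        set_ini_option "$ini" tool_path
    fi

The helper writes to a temporary file and then moves it into place, which avoids relying on ``sed -i``, whose syntax differs between Linux and OS X. Keep a backup of ``universe_wsgi.ini`` before running something like this.
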
=======================
Tool Shed configuration
=======================

- Register a new user account in your Galaxy instance: Top Panel → User → Register
- Become an admin

  - open ``universe_wsgi.ini`` in your favourite text editor (``gedit universe_wsgi.ini``)
  - search for ``admin_users = None`` and change it to ``admin_users = EMAIL_ADDRESS`` (your Galaxy username)
  - remove the ``#`` if needed
  - restart Galaxy

::

    sh run.sh --reload


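To confirm that the ``admin_users`` change was saved before restarting, the active value can be read back from the file. The sketch below uses a hypothetical helper, ``get_ini_option``, and assumes the option is written on a single line as ``key = value``; commented-out lines are deliberately ignored::

    # Print the value of an uncommented KEY in FILE (prints nothing if the
    # option is missing or still commented out).
    # Usage: get_ini_option FILE KEY
    get_ini_option() {
        sed -n "s|^[[:space:]]*$2[[:space:]]*=[[:space:]]*\(.*\)\$|\1|p" "$1"
    }

    # Show the configured admin, but only if the configuration file exists.
    ini=~/galaxy-central/universe_wsgi.ini
    if [ -f "$ini" ]; then
        echo "admin_users is: $(get_ini_option "$ini" admin_users)"
    fi

An empty result usually means the ``#`` in front of ``admin_users`` has not been removed yet.
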
=======================
BlockClust installation
=======================

BlockClust will automatically download and compile all requirements,
such as EDeN, samtools and so on. This can take one to two hours.


Installation via web browser
============================

- go to the `admin page`_
- select *Search and browse tool sheds*
- Galaxy test tool shed > Sequence Analysis > blockclust_workflow
- install

.. _admin page: http://localhost:8080/admin


===============
Troubleshooting
===============

You can navigate to the installed blockclust_workflow repository in your browser and repair it manually:
Top Panel → Admin → Manage installed tool shed repositories → blockclust_workflow → Repository Actions → Repair repository

------


On slow computers, and during the compilation of large software libraries such as R,
the Tool Shed can run into a timeout and kill the installation.
This problem is known and should be fixed in the near future.

If you encounter a timeout or a hang during the installation, you can increase ``threadpool_kill_thread_limit`` in your ``universe_wsgi.ini`` file.


------

**Database locking errors**

Please note that Galaxy uses a SQLite database by default. SQLite is not intended for production use.
With multiple users or complex components, like this workflow, you will see database locking errors.
We highly recommend using PostgreSQL for any kind of production system.


.. _Galaxy wiki: http://wiki.galaxyproject.org/


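Switching to PostgreSQL comes down to one line in ``universe_wsgi.ini``. As a hedged illustration, the hypothetical helper below prints such a line; it assumes the option is called ``database_connection``, as in the sample configuration of this Galaxy generation, and the user, password, host and database name are placeholders. Creating the PostgreSQL user and database is not covered here::

    # Print a database_connection line for universe_wsgi.ini.
    # Usage: pg_connection_line USER PASSWORD HOST DATABASE
    pg_connection_line() {
        printf 'database_connection = postgresql://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
    }

    # Example with placeholder credentials.
    pg_connection_line galaxy secret localhost galaxy_db

Paste the printed line into ``universe_wsgi.ini`` in place of the SQLite setting and restart Galaxy.
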
Workflows
=========

The BlockClust workflow is located in the `Tool Shed`::

    http://toolshed.g2.bx.psu.edu/view/rnateam/blockclust_workflow

To make the successfully installed workflow available to all your users, go to the admin panel, choose the workflow and import it.
For more information have a look at the Galaxy wiki::

    http://wiki.galaxyproject.org/ToolShedWorkflowSharing#Finding_workflows_in_tool_shed_repositories

Please **note** that Galaxy uses a SQLite database by default. SQLite is not intended for production use.
With multiple users or complex components, like this workflow, you will see database locking errors.
We highly recommend using PostgreSQL for any kind of production system.


Sample Data
===========



Citation
========

If you use this workflow directly, or a derivative of it, or the associated
wrappers for Galaxy, in work leading to a scientific publication,
please cite:

Pavankumar Videm, Dominic Rose, Fabrizio Costa, and Rolf Backofen. "BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles." Bioinformatics 30, no. 12 (2014): i274-i282.



Additional References
=====================



Availability
============

This workflow is available on the main Galaxy Tool Shed:

http://toolshed.g2.bx.psu.edu/view/rnateam/blockclust_workflow

Development is being done on GitHub:

https://github.com/bgruening/galaxytools/tree/master/workflows/blockclust


Dependencies
============

These dependencies should be resolved automatically via the Galaxy Tool Shed:

* http://testtoolshed.g2.bx.psu.edu/view/iuc/package_samtools_0_1_19
* http://testtoolshed.g2.bx.psu.edu/view/iuc/package_r_3_0_1
* http://testtoolshed.g2.bx.psu.edu/view/iuc/msa_datatypes
* http://testtoolshed.g2.bx.psu.edu/view/iuc/package_infernal_1_1rc4
* http://testtoolshed.g2.bx.psu.edu/view/rnateam/blockbuster
* http://testtoolshed.g2.bx.psu.edu/view/bgruening/package_eden_1_1
* http://testtoolshed.g2.bx.psu.edu/view/iuc/package_mcl_12_135