Repository blockclust_workflow
Owner: rnateam
Synopsis: BlockClust Workflow
This Tool Shed Repository contains a workflow for BlockClust.
Type: unrestricted
Revision: 9:70170338b4f7
This revision can be installed: True
Times cloned / installed: 233

Repository README files - may contain important installation or license information

This package is a Galaxy workflow for BlockClust pipeline.

Galaxy

Galaxy is an open, web-based platform for data intensive research. All tools can be combined in workflows without any need of programming skills. Furthermore the platform can be extended with more tools at any time. Each tool has its own information about what it does and how the input is supposed to look like. You can make data available for Galaxy by uploading local files or downloading online content. Inputfiles, workflowsteps and results are stored in a history where you can view them or reaccess them later. It is possible to share workflows and histories with other users or make the public available. Saved workflows can be used with new input files or just to rerun an analyses which ensures repeatability.

Getting Started

BlockClust can be installed on all common Unix systems. However, it is developed on Linux and I don't have access to OS X. You are welcome to help improving this documentation, just contact me.

For any additional information, especially cluster configuration or general Galaxy questions, please have a look at the Galaxy Wiki.

Prerequisites:

* Python 2.6 or 2.7
* standard C compiler, C++ and Fortran compiler
* Autotools
* CMake
* cairo development files (used for PNG depictions)
* python development files
* Java Runtime Environment (JRE, used by OPSIN and NPLS)

To install all of the prerequisites you can run the following command, depending on your OS:

  • Debian based systems: apt-get install build-essential gfortran cmake mercurial libcairo2-dev python-dev
  • Fedora: yum install make automake gcc gcc-c++ gcc-gfortran cmake mercurial libcairo2-devel python-devel
  • OS X (MacPorts): port install gcc cmake automake mercurial cairo-devel

Galaxy installation

  1. Create a sand-boxed Python using virtualenv (not necessary but recommended):

    wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py
    python ./virtualenv.py --no-site-packages galaxy_env
    . ./galaxy_env/bin/activate
    
  1. Clone the latest Galaxy platform:

    hg clone https://bitbucket.org/galaxy/galaxy-central/
    
  1. Navigate to the galaxy-central folder and update it:

    cd ~/galaxy-central
    hg pull
    hg update
    

    This step is not necessary if you have a fresh checkout. Anyway, it is good to know ;)

  2. Create folders for toolshed and dependencies:

    mkdir ~/shed_tools
    mkdir ~/galaxy-central/tool_deps
    
  3. Create configuration file:

    cp ~/galaxy-central/universe_wsgi.ini.sample ~/galaxy-central/universe_wsgi.ini
    
  4. Open universe_wsgi.ini and change the dependencies directory:

    LINUX: gedit ~/galaxy-central/universe_wsgi.ini
    OS X: open -a TextEdit ~/galaxy-central/universe_wsgi.ini
    
  5. Search for tool_dependency_dir = None and change it to tool_dependency_dir = ./tool_deps, remove the # if needed

  6. Remove the # in front of tool_config_file and tool_path

  7. (Re-)Start the galaxy daemon:

    sh run.sh --reload
    

    In deamon mode all logs will be written to main.log in your Galaxy Home directory. You can also use:

    run.sh
    

    During the first startup Galaxy will prepare your database. That can take some time. Have a look at the log file if you want to know what happens.

After launching galaxy is accessible via the browser at http://localhost:8080/.

Tool Shed configuration

  • Register a new user account in your Galaxy instance: Top Panel → User → Register

  • Become an admin
    • open universe_wsgi.ini in your favourite text editor (gedit universe_wsgi.ini)
    • search admin_users = None and change it to admin_users = EMAIL_ADDRESS (your Galaxy Username)
    • remove the # if needed
  • restart Galaxy

sh run.sh --reload

BlockClust installation

BlockClust will automatically download and compile all requirements, like EDeN, samtools and so on. It can take up to 1-2 hours.

Installation via webbrowser

  • go to the admin page
  • select Search and browse tool sheds
  • Galaxy test tool shed > Sequence Analysis > blockclust_workflow
  • install

Troubleshooting

You can navigate to the blockclust_workflow repository in your browser and repair manually: Top Panel → Admin → Manage installed tool shed repositories → blockclust_workflow → Repository Actions → Repair repository


On slow computers and during the compilation of large software libraries, like R, the Tool Shed can run into a timeout and kills the installation. That problem is known and should be fixed in the near future.

If you encouter a timeout or 'hung' during the installation you can increase the threadpool_kill_thread_limit in your universe_wsgi.ini file.


Database locking errors

Please note that Galaxy per default uses a SQLite database. Sqlite is not intended for production use. With multiple users or complex components, like that workflow, you will see database locking errors. We highly recommend to use PostgreSQL for any kind of production system.

Workflows

The BlockClust workflow is located in the Tool Shed:

http://toolshed.g2.bx.psu.edu/view/rnateam/blockclust_workflow

To import successfully installed the workflow to all your users you need to go to the admin panel, choose the worklow and import it. For more information have a look at the Galaxy wiki:

http://wiki.galaxyproject.org/ToolShedWorkflowSharing#Finding_workflows_in_tool_shed_repositories

Please note that Galaxy per default uses a SQLite database. Sqlite is not intended for production use. With multiple users or complex components, like that workflow, you will see database locking errors. We highly recommend to use PostgreSQL for any kind of production system.

Sample Data

Citation

If you use this workflow directly, or a derivative of it, or the associated wrappers for Galaxy, in work leading to a scientific publication, please cite:

Pavankumar Videm, Dominic Rose, Fabrizio Costa, and Rolf Backofen. "BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles." Bioinformatics 30, no. 12 (2014): i274-i282.

Additional References

Availability

This workflow is available on the main Galaxy Tool Shed:

http://toolshed.g2.bx.psu.edu/view/rnateam/blockclust_workflow

Development is being done on github:

https://github.com/bgruening/galaxytools/tree/master/workflows/blockclust

Dependencies of this repository

Repository package_libpng_1_6_7 revision 3de32cd300a9 owned by devteam (prior install required)
Repository package_samtools_0_1_19 revision c9bd782f5342 owned by iuc
Repository package_r_3_0_3 revision afc48696ee5c owned by iuc
Repository package_eden_1_1 revision 8087eef83c35 owned by rnateam
Repository package_mcl_12_135 revision 9f59d1a59999 owned by iuc
Repository msa_datatypes revision 70227007b991 owned by iuc
Repository package_infernal_1_1 revision 2bac556aa73f owned by iuc
Repository package_gnu_coreutils_8_22 revision ac64dfe4b1fb owned by iuc
Repository package_freetype_2_5_2 revision 1e86d1cf79a4 owned by iuc (prior install required)
Repository package_libxml2_2_9_1 revision 45b16a3ab504 owned by iuc (prior install required)
Repository package_pixman_0_32_4 revision 93cd8e03820c owned by devteam (prior install required)
Repository package_libpng_1_6_7 revision 3de32cd300a9 owned by devteam (prior install required)
Repository package_freetype_2_5_2 revision a65217367e4a owned by devteam (prior install required)
Repository package_fontconfig_2_11_1 revision d88d844df0cb owned by iuc (prior install required)
Repository package_libpng_1_6_7 revision 3de32cd300a9 owned by devteam (prior install required)
Repository package_readline_6_2 revision 5313ca2fbce6 owned by iuc (prior install required)
Repository package_libpng_1_6_7 revision 3de32cd300a9 owned by devteam (prior install required)
Repository package_cairo_1_12_14 revision cd2ed5717e38 owned by devteam (prior install required)
Repository package_pixman_0_32_4 revision 93cd8e03820c owned by devteam (prior install required)
Repository package_freetype_2_5_2 revision a65217367e4a owned by devteam (prior install required)
Repository package_zlib_1_2_8 revision 411985b46ae8 owned by iuc (prior install required)
Repository package_zlib_1_2_8 revision 411985b46ae8 owned by iuc (prior install required)
Repository package_zlib_1_2_8 revision 411985b46ae8 owned by iuc (prior install required)
Repository package_libpng_1_6_7 revision 588666932a32 owned by iuc (prior install required)
Repository package_ncurses_5_9 revision 335ebf512407 owned by iuc (prior install required)
Repository blockbuster revision 6ae946589f04 owned by rnateam

Contents of this repository

Name steps format-version annotation
BlockClustWorkflow 8 0.1
BlockClustWorkflow_complete 8 0.1
BlockClustWorkflow_clustering 5 0.1 Until clustering step. No post processing included.

Categories
Sequence Analysis - Tools for performing Protein and DNA/RNA analysis