# HG changeset patch # User pjbriggs # Date 1539868684 14400 # Node ID 3ab198df8f3f86e81e054867cca437144fe1841d # Parent 43d6f81bc667edc08cd4ae4b599cb2ad5ef4d4aa planemo upload for repository https://github.com/pjbriggs/Amplicon_analysis-galaxy commit 15390f18b91d838880d952eb2714f689bbd8a042 diff -r 43d6f81bc667 -r 3ab198df8f3f README.rst --- a/README.rst Wed Jun 13 07:45:06 2018 -0400 +++ b/README.rst Thu Oct 18 09:18:04 2018 -0400 @@ -26,20 +26,8 @@ instance to detect the dependencies and reference data correctly at run time. -1. Install the dependencies ---------------------------- - -The ``install_tool_deps.sh`` script can be used to fetch and install the -dependencies locally, for example:: - - install_tool_deps.sh /path/to/local_tool_dependencies - -This can take some time to complete. When finished it should have -created a set of directories containing the dependencies under the -specified top level directory. - -2. Install the tool files -------------------------- +1. Install the tool from the toolshed +------------------------------------- The core tool is hosted on the Galaxy toolshed, so it can be installed directly from there (this is the recommended route): @@ -58,7 +46,7 @@ -3. Install the reference data +2. Install the reference data ----------------------------- The script ``References.sh`` from the pipeline package at @@ -72,33 +60,14 @@ will install the data in ``/path/to/pipeline/data``. **NB** The final amount of data downloaded and uncompressed will be -around 6GB. - -4. Configure dependencies and reference data in Galaxy ------------------------------------------------------- - -The final steps are to make your Galaxy installation aware of the -tool dependencies and reference data, so it can locate them both when -the tool is run. - -To target the tool dependencies installed previously, add the -following lines to the ``dependency_resolvers_conf.xml`` file in the -Galaxy ``config`` directory:: +around 9GB. - - ... - - - ... - +3. 
Configure reference data location in Galaxy
+----------------------------------------------
-(NB it is recommended to place these *before* the ````
-resolvers)
-
-(If you're not familiar with dependency resolvers in Galaxy then
-see the documentation at
-https://docs.galaxyproject.org/en/master/admin/dependency_resolvers.html
-for more details.)
+The final step is to make your Galaxy installation aware of the
+location of the reference data, so that the data can be located
+when the tool is run.

The tool locates the reference data via an environment variable called
``AMPLICON_ANALYSIS_REF_DATA_PATH``, which needs to be set to the parent
@@ -108,7 +77,8 @@
installation is configured:

* **For local instances:** add a line to set it in the
-  ``config/local_env.sh`` file of your Galaxy installation, e.g.::
+  ``config/local_env.sh`` file of your Galaxy installation (you
+  may need to create a new empty file first), e.g.::

    export AMPLICON_ANALYSIS_REF_DATA_PATH=/path/to/pipeline/data

@@ -124,9 +94,9 @@

  (For more about job destinations see the Galaxy documentation at
-  https://galaxyproject.org/admin/config/jobs/#job-destinations)
+  https://docs.galaxyproject.org/en/master/admin/jobs.html#job-destinations)

-5. Enable rendering of HTML outputs from pipeline
+4. Enable rendering of HTML outputs from pipeline
-------------------------------------------------

To ensure that HTML outputs are displayed correctly in Galaxy
@@ -171,46 +141,32 @@
https://github.com/galaxyproject/galaxy/issues/4490 and
https://github.com/galaxyproject/galaxy/issues/1676

-Appendix: availability of tool dependencies
-===========================================
-
-The tool takes its dependencies from the underlying pipeline script (see
-https://github.com/MTutino/Amplicon_analysis/blob/master/README.md
-for details).
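Before restarting Galaxy it is worth confirming that the variable really does point at the installed reference data. The following is a minimal sketch of such a check (the fallback path is a placeholder, not part of the pipeline)::

```shell
# Minimal sanity check (sketch): confirm AMPLICON_ANALYSIS_REF_DATA_PATH
# is set and points at an existing directory before starting Galaxy.
# The fallback path below is a placeholder, not part of the pipeline.
AMPLICON_ANALYSIS_REF_DATA_PATH="${AMPLICON_ANALYSIS_REF_DATA_PATH:-/path/to/pipeline/data}"
if [ -d "$AMPLICON_ANALYSIS_REF_DATA_PATH" ]; then
    echo "reference data directory found: $AMPLICON_ANALYSIS_REF_DATA_PATH"
else
    echo "WARNING: reference data directory not found: $AMPLICON_ANALYSIS_REF_DATA_PATH" >&2
fi
```

If the warning is printed, the tool will start but fail at run time when it tries to read the reference data.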
+Appendix: installing the dependencies manually
+==============================================

-As noted above, currently the ``install_tool_deps.sh`` script can be
-used to manually install the dependencies for a local tool install.
+If the tool is installed from the Galaxy toolshed (recommended) then
+the dependencies should be installed automatically and this step can
+be skipped.

-In principle these should also be available if the tool were installed
-from a toolshed. However it would be preferrable in this case to get as
-many of the dependencies as possible via the ``conda`` dependency
-resolver.
+Otherwise the ``install_amplicon_analysis.sh`` script can be used
+to fetch and install the dependencies locally, for example::

-The following are known to be available via conda, with the required
-version:
+   install_amplicon_analysis.sh /path/to/local_tool_dependencies

- - cutadapt 1.8.1
- - sickle-trim 1.33
- - bioawk 1.0
- - fastqc 0.11.3
- - R 3.2.0
-
-Some dependencies are available but with the "wrong" versions:
+(This is the same script as is used to install dependencies from the
+toolshed.) This can take some time to complete; when finished it will
+have created a directory called ``Amplicon_analysis-1.2.3`` containing
+the dependencies under the specified top level directory.

- - spades (need 3.5.0)
- - qiime (need 1.8.0)
- - blast (need 2.2.26)
- - vsearch (need 1.1.3)
-
-The following dependencies are currently unavailable:
+**NB** The installed dependencies will occupy around 2.6GB of disk
+space.
- - fasta_number (need 02jun2015) - - fasta-splitter (need 0.2.4) - - rdp_classifier (need 2.2) - - microbiomeutil (need r20110519) +You will need to make sure that the ``bin`` subdirectory of this +directory is on Galaxy's ``PATH`` at runtime, for the tool to be able +to access the dependencies - for example by adding a line to the +``local_env.sh`` file like:: -(NB usearch 6.1.544 and 8.0.1623 are special cases which must be -handled outside of Galaxy's dependency management systems.) + export PATH=/path/to/local_tool_dependencies/Amplicon_analysis-1.2.3/bin:$PATH History ======= @@ -218,6 +174,8 @@ ========== ====================================================================== Version Changes ---------- ---------------------------------------------------------------------- +1.2.3.0 Updated to Amplicon_Analysis_Pipeline version 1.2.3; install + dependencies via tool_dependencies.xml. 1.2.2.0 Updated to Amplicon_Analysis_Pipeline version 1.2.2 (removes jackknifed analysis which is not captured by Galaxy tool) 1.2.1.0 Updated to Amplicon_Analysis_Pipeline version 1.2.1 (adds diff -r 43d6f81bc667 -r 3ab198df8f3f amplicon_analysis_pipeline.py --- a/amplicon_analysis_pipeline.py Wed Jun 13 07:45:06 2018 -0400 +++ b/amplicon_analysis_pipeline.py Thu Oct 18 09:18:04 2018 -0400 @@ -60,9 +60,10 @@ sys.stderr.write("%s\n\n" % ('*'*width)) def clean_up_name(sample): - # Remove trailing "_L[0-9]+_001" from Fastq - # pair names - split_name = sample.split('_') + # Remove extensions and trailing "_L[0-9]+_001" from + # Fastq pair names + sample_name = '.'.join(sample.split('.')[:1]) + split_name = sample_name.split('_') if split_name[-1] == "001": split_name = split_name[:-1] if split_name[-1].startswith('L'): @@ -139,10 +140,12 @@ # Link to FASTQs and construct Final_name.txt file sample_names = [] + print "-- making Final_name.txt" with open("Final_name.txt",'w') as final_name: fastqs = iter(args.fastq_pairs) for sample_name,fqr1,fqr2 in zip(fastqs,fastqs,fastqs): 
sample_name = clean_up_name(sample_name) + print " %s" % sample_name r1 = "%s_R1_.fastq" % sample_name r2 = "%s_R2_.fastq" % sample_name os.symlink(fqr1,r1) diff -r 43d6f81bc667 -r 3ab198df8f3f amplicon_analysis_pipeline.xml --- a/amplicon_analysis_pipeline.xml Wed Jun 13 07:45:06 2018 -0400 +++ b/amplicon_analysis_pipeline.xml Thu Oct 18 09:18:04 2018 -0400 @@ -1,21 +1,7 @@ - + analyse 16S rRNA data from Illumina Miseq paired-end reads - amplicon_analysis_pipeline - cutadapt - sickle - bioawk - pandaseq - spades - fastqc - qiime - blast - fasta-splitter - rdp-classifier - R - vsearch - microbiomeutil - fasta_number + amplicon_analysis_pipeline diff -r 43d6f81bc667 -r 3ab198df8f3f install_amplicon_analysis.sh --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/install_amplicon_analysis.sh Thu Oct 18 09:18:04 2018 -0400 @@ -0,0 +1,425 @@ +#!/bin/sh -e +# +# Prototype script to setup a conda environment with the +# dependencies needed for the Amplicon_analysis_pipeline +# script +# +# Handle command line +usage() +{ + echo "Usage: $(basename $0) [DIR]" + echo "" + echo "Installs the Amplicon_analysis_pipeline package plus" + echo "dependencies in directory DIR (or current directory " + echo "if DIR not supplied)" +} +if [ ! 
-z "$1" ] ; then + # Check if help was requested + case "$1" in + --help|-h) + usage + exit 0 + ;; + esac + # Assume it's the installation directory + cd $1 +fi +# Versions +PIPELINE_VERSION=1.2.3 +RDP_CLASSIFIER_VERSION=2.2 +# Directories +TOP_DIR=$(pwd)/Amplicon_analysis-${PIPELINE_VERSION} +BIN_DIR=${TOP_DIR}/bin +CONDA_DIR=${TOP_DIR}/conda +CONDA_BIN=${CONDA_DIR}/bin +CONDA_LIB=${CONDA_DIR}/lib +CONDA=${CONDA_BIN}/conda +ENV_NAME="amplicon_analysis_pipeline@${PIPELINE_VERSION}" +ENV_DIR=${CONDA_DIR}/envs/$ENV_NAME +# +# Functions +# +# Report failure and terminate script +fail() +{ + echo "" + echo ERROR $@ >&2 + echo "" + echo "$(basename $0): installation failed" + exit 1 +} +# +# Rewrite the shebangs in the installed conda scripts +# to remove the full path to conda 'bin' directory +rewrite_conda_shebangs() +{ + pattern="s,^#!${CONDA_BIN}/,#!/usr/bin/env ,g" + find ${CONDA_BIN} -type f -exec sed -i "$pattern" {} \; +} +# +# Install conda +install_conda() +{ + echo "++++++++++++++++" + echo "Installing conda" + echo "++++++++++++++++" + if [ -e ${CONDA_DIR} ] ; then + echo "*** $CONDA_DIR already exists ***" >&2 + return + fi + local cwd=$(pwd) + local wd=$(mktemp -d) + cd $wd + wget -q https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh + bash ./Miniconda2-latest-Linux-x86_64.sh -b -p ${CONDA_DIR} + echo Installed conda in ${CONDA_DIR} + # Update the installation files + # This is to avoid problems when the length the installation + # directory path exceeds the limit for the shebang statement + # in the conda files + echo "" + echo -n "Rewriting conda shebangs..." + rewrite_conda_shebangs + echo "ok" + echo -n "Adding conda bin to PATH..." 
+ PATH=${CONDA_BIN}:$PATH + echo "ok" + cd $cwd + rm -rf $wd/* + rmdir $wd +} +# +# Create conda environment +install_conda_packages() +{ + echo "+++++++++++++++++++++++++" + echo "Installing conda packages" + echo "+++++++++++++++++++++++++" + local cwd=$(pwd) + local wd=$(mktemp -d) + cd $wd + cat >environment.yml <${BIN_DIR}/Amplicon_analysis_pipeline.sh <${BIN_DIR}/install_reference_data.sh <${BIN_DIR}/ChimeraSlayer.pl <INSTALL.log 2>&1 + echo "ok" + cd R-3.2.1 + echo -n "Running configure..." + ./configure --prefix=$INSTALL_DIR --with-x=no --with-readline=no >>INSTALL.log 2>&1 + echo "ok" + echo -n "Running make..." + make >>INSTALL.log 2>&1 + echo "ok" + echo -n "Running make install..." + make install >>INSTALL.log 2>&1 + echo "ok" + cd $cwd + rm -rf $wd/* + rmdir $wd + . ${CONDA_BIN}/deactivate +} +setup_pipeline_environment() +{ + echo "+++++++++++++++++++++++++++++++" + echo "Setting up pipeline environment" + echo "+++++++++++++++++++++++++++++++" + # vsearch113 + echo -n "Setting up vsearch113..." + if [ -e ${BIN_DIR}/vsearch113 ] ; then + echo "already exists" + elif [ ! -e ${ENV_DIR}/bin/vsearch ] ; then + echo "failed" + fail "vsearch not found" + else + ln -s ${ENV_DIR}/bin/vsearch ${BIN_DIR}/vsearch113 + echo "ok" + fi + # fasta_splitter.pl + echo -n "Setting up fasta_splitter.pl..." + if [ -e ${BIN_DIR}/fasta-splitter.pl ] ; then + echo "already exists" + elif [ ! -e ${ENV_DIR}/share/fasta-splitter/fasta-splitter.pl ] ; then + echo "failed" + fail "fasta-splitter.pl not found" + else + ln -s ${ENV_DIR}/share/fasta-splitter/fasta-splitter.pl ${BIN_DIR}/fasta-splitter.pl + echo "ok" + fi + # rdp_classifier.jar + local rdp_classifier_jar=rdp_classifier-${RDP_CLASSIFIER_VERSION}.jar + echo -n "Setting up rdp_classifier.jar..." + if [ -e ${TOP_DIR}/share/rdp_classifier/${rdp_classifier_jar} ] ; then + echo "already exists" + elif [ ! 
-e ${ENV_DIR}/share/rdp_classifier/rdp_classifier.jar ] ; then + echo "failed" + fail "rdp_classifier.jar not found" + else + mkdir -p ${TOP_DIR}/share/rdp_classifier + ln -s ${ENV_DIR}/share/rdp_classifier/rdp_classifier.jar ${TOP_DIR}/share/rdp_classifier/${rdp_classifier_jar} + echo "ok" + fi + # qiime_config + echo -n "Setting up qiime_config..." + if [ -e ${TOP_DIR}/qiime/qiime_config ] ; then + echo "already exists" + else + mkdir -p ${TOP_DIR}/qiime + cat >${TOP_DIR}/qiime/qiime_config <>$INSTALL_DIR/INSTALLATION.log 2>&1 -EOF - popd - rm -rf $wd/* - rmdir $wd -} -function install_amplicon_analysis_pipeline_1_2_2() { - install_amplicon_analysis_pipeline $1 1.2.2 -} -function install_amplicon_analysis_pipeline_1_2_1() { - install_amplicon_analysis_pipeline $1 1.2.1 -} -function install_amplicon_analysis_pipeline_1_1() { - install_amplicon_analysis_pipeline $1 1.1 -} -function install_amplicon_analysis_pipeline_1_0() { - install_amplicon_analysis_pipeline $1 1.0 -} -function install_amplicon_analysis_pipeline() { - version=$2 - echo Installing Amplicon_analysis $version - install_dir=$1/amplicon_analysis_pipeline/$version - if [ -f $install_dir/env.sh ] ; then - return - fi - mkdir -p $install_dir - echo Moving to $install_dir - pushd $install_dir - wget -q https://github.com/MTutino/Amplicon_analysis/archive/v${version}.tar.gz - tar zxf v${version}.tar.gz - mv Amplicon_analysis-${version} Amplicon_analysis - rm -rf v${version}.tar.gz - popd - # Make setup file - cat > $install_dir/env.sh < $install_dir/env.sh < $INSTALL_DIR/env.sh <$INSTALL_DIR/INSTALLATION.log 2>&1 - mv sickle $INSTALL_DIR/bin - popd - rm -rf $wd/* - rmdir $wd - # Make setup file - cat > $INSTALL_DIR/env.sh <$INSTALL_DIR/INSTALLATION.log 2>&1 - mv bioawk $INSTALL_DIR/bin - mv maketab $INSTALL_DIR/bin - popd - rm -rf $wd/* - rmdir $wd - # Make setup file - cat > $INSTALL_DIR/env.sh <$install_dir/INSTALLATION.log 2>&1 - ./configure --prefix=$install_dir >>$install_dir/INSTALLATION.log 2>&1 - 
make; make install >>$install_dir/INSTALLATION.log 2>&1 - popd - rm -rf $wd/* - rmdir $wd - # Make setup file - cat > $1/pandaseq/2.8.1/env.sh < $1/spades/3.5.0/env.sh < $1/fastqc/0.11.3/env.sh < test.f90 - gfortran -o test test.f90 - LGF=`ldd test | grep libgfortran | awk '{print $3}'` - LGF_CANON=`readlink -f $LGF` - LGF_VERS=`objdump -p $LGF_CANON | grep GFORTRAN_1 | sed -r 's/.*GFORTRAN_1\.([0-9])+/\1/' | sort -n | tail -1` - if [ $LGF_VERS -gt $BUNDLED_LGF_VERS ]; then - cp -p $BUNDLED_LGF_CANON ${BUNDLED_LGF_CANON}.bundled - cp -p $LGF_CANON $BUNDLED_LGF_CANON - fi - popd - rm -rf $wd/* - rmdir $wd - # Atlas 3.10 (build from source) - # NB this stolen from galaxyproject/iuc-tools - ##local wd=$(mktemp -d) - ##echo Moving to $wd - ##pushd $wd - ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_3.10.2+gx0_src_all.tar.bz2 - ##wget -q https://depot.galaxyproject.org/software/lapack/lapack_3.5.0_src_all.tar.gz - ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-blas-lapack-1.0_src_all.diff - ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-shared-lib-1.0_src_all.diff - ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-cpu-throttle-1.0_src_all.diff - ##tar -jxvf atlas_3.10.2+gx0_src_all.tar.bz2 - ##cd ATLAS - ##mkdir build - ##patch -p1 < ../atlas_patch-blas-lapack-1.0_src_all.diff - ##patch -p1 < ../atlas_patch-shared-lib-1.0_src_all.diff - ##patch -p1 < ../atlas_patch-cpu-throttle-1.0_src_all.diff - ##cd build - ##../configure --prefix="$INSTALL_DIR" -D c -DWALL -b 64 -Fa alg '-fPIC' --with-netlib-lapack-tarfile=../../lapack_3.5.0_src_all.tar.gz -v 2 -t 0 -Si cputhrchk 0 - ##make - ##make install - ##popd - ##rm -rf $wd/* - ##rmdir $wd - export ATLAS_LIB_DIR=$INSTALL_DIR/lib - export ATLAS_INCLUDE_DIR=$INSTALL_DIR/include - export ATLAS_BLAS_LIB_DIR=$INSTALL_DIR/lib/atlas - export ATLAS_LAPACK_LIB_DIR=$INSTALL_DIR/lib/atlas - export ATLAS_ROOT_PATH=$INSTALL_DIR - export 
LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH - export LD_LIBRARY_PATH=$INSTALL_DIR/lib/atlas:$LD_LIBRARY_PATH - # Numpy 1.7.1 - local wd=$(mktemp -d) - echo Moving to $wd - pushd $wd - wget -q https://depot.galaxyproject.org/software/numpy/numpy_1.7_src_all.tar.gz - tar -zxvf numpy_1.7_src_all.tar.gz - cd numpy-1.7.1 - cat > site.cfg < $INSTALL_DIR/env.sh < $install_dir/env.sh <$install_dir/INSTALLATION.log 2>&1 - mv * $install_dir - popd - # Clean up - rm -rf $wd/* - rmdir $wd - # Make setup file -cat > $install_dir/env.sh < $install_dir/env.sh < $install_dir/env.sh <>$install_dir/INSTALLATION.log -EOF - done - # Install fasta-splitter - wget -q http://kirill-kryukov.com/study/tools/fasta-splitter/files/fasta-splitter-0.2.4.zip - unzip -qq fasta-splitter-0.2.4.zip - chmod 0755 fasta-splitter.pl - mv fasta-splitter.pl $install_dir/bin - popd - # Clean up - rm -rf $wd/* - rmdir $wd - # Make setup file -cat > $install_dir/env.sh < $install_dir/env.sh < $install_dir/env.sh <$install_dir/bin/uc2otutab.py - cat uc2otutab.py >>$install_dir/bin/uc2otutab.py - chmod +x $install_dir/bin/uc2otutab.py - popd - # Clean up - rm -rf $wd/* - rmdir $wd - # Make setup file -cat > $install_dir/env.sh < + + + + + https://raw.githubusercontent.com/pjbriggs/Amplicon_analysis-galaxy/master/install_amplicon_analysis.sh + + sh ./install_amplicon_analysis.sh $INSTALL_DIR + + + $INSTALL_DIR/Amplicon_analysis-1.2.3/bin + + + + +
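
The ``tool_dependencies.xml`` recipe above (download the installer, run it into ``$INSTALL_DIR``, prepend the resulting ``bin`` directory to ``PATH``) can be reproduced by hand. The following is a sketch of that manual sequence; ``TOOL_DEPS`` is a placeholder, and the network/install steps are commented out so the snippet can be dry-run safely:

```shell
# Sketch of the manual equivalent of the tool_dependencies.xml recipe.
# TOOL_DEPS is a placeholder for the installation directory; the
# download and install commands are commented out for a safe dry run.
TOOL_DEPS="${TOOL_DEPS:-$(mktemp -d)}"
cd "$TOOL_DEPS"
# wget -q https://raw.githubusercontent.com/pjbriggs/Amplicon_analysis-galaxy/master/install_amplicon_analysis.sh
# sh ./install_amplicon_analysis.sh "$TOOL_DEPS"
# At runtime the pipeline's bin directory must be on Galaxy's PATH:
PATH="$TOOL_DEPS/Amplicon_analysis-1.2.3/bin:$PATH"
export PATH
echo "PATH now starts with: $TOOL_DEPS/Amplicon_analysis-1.2.3/bin"
```

This mirrors what the toolshed install does automatically via the ``set_environment`` step, so it is only needed for manual installs.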