Repository 'amplicon_analysis_pipeline'
hg clone https://toolshed.g2.bx.psu.edu/repos/pjbriggs/amplicon_analysis_pipeline

Changeset 0:47ec9c6f44b8 (2017-11-09)
Next changeset 1:1c1902e12caf (2018-04-25)
Commit message:
planemo upload for repository https://github.com/pjbriggs/Amplicon_analysis-galaxy commit b63924933a03255872077beb4d0fde49d77afa92
added:
README.rst
amplicon_analysis_pipeline.py
amplicon_analysis_pipeline.xml
install_tool_deps.sh
static/images/Pipeline_description_Fig1.png
static/images/Pipeline_description_Fig2.png
static/images/Pipeline_description_Fig3.png
diff -r 000000000000 -r 47ec9c6f44b8 README.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst Thu Nov 09 10:13:29 2017 -0500
@@ -0,0 +1,249 @@
+Amplicon_analysis-galaxy
+========================
+
+A Galaxy tool wrapper for Mauro Tutino's ``Amplicon_analysis`` pipeline
+script at https://github.com/MTutino/Amplicon_analysis
+
+The pipeline can analyse paired-end 16S rRNA data from Illumina MiSeq
+(Casava >= 1.8) and performs the following operations:
+
+ * QC and clean up of input data
+ * Removal of singletons and chimeras and building of OTU table
+   and phylogenetic tree
+ * Beta and alpha diversity analysis
+
+Usage documentation
+===================
+
+Usage of the tool (including required inputs) is documented within
+the ``help`` section of the tool XML.
+
+Installing the tool in a Galaxy instance
+========================================
+
+The following sections describe how to install the tool files,
+dependencies and reference data, and how to configure the Galaxy
+instance to detect the dependencies and reference data correctly
+at run time.
+
+1. Install the dependencies
+---------------------------
+
+The ``install_tool_deps.sh`` script can be used to fetch and install the
+dependencies locally, for example::
+
+    install_tool_deps.sh /path/to/local_tool_dependencies
+
+This can take some time to complete. When finished it should have
+created a set of directories containing the dependencies under the
+specified top-level directory.
+
+2. Install the tool files
+-------------------------
+
+The core tool is hosted on the Galaxy toolshed, so it can be installed
+directly from there (this is the recommended route):
+
+ * https://toolshed.g2.bx.psu.edu/view/pjbriggs/amplicon_analysis_pipeline/
+
+Alternatively it can be installed manually; in this case there are two
+files to install:
+
+ * ``amplicon_analysis_pipeline.xml`` (the Galaxy tool definition)
+ * ``amplicon_analysis_pipeline.py`` (the Python wrapper script)
+
+Put these in a directory that is visible to Galaxy (e.g. a
+``tools/Amplicon_analysis/`` folder), and modify the ``tool_conf.xml``
+file to tell Galaxy to offer the tool by adding a line e.g.::
+
+    <tool file="Amplicon_analysis/amplicon_analysis_pipeline.xml" />
+
+3. Install the reference data
+-----------------------------
+
+The script ``References.sh`` from the pipeline package at
+https://github.com/MTutino/Amplicon_analysis can be run to install
+the reference data, for example::
+
+    cd /path/to/pipeline/data
+    wget https://github.com/MTutino/Amplicon_analysis/raw/master/References.sh
+    /bin/bash ./References.sh
+
+will install the data in ``/path/to/pipeline/data``.
+
+**NB** The final amount of data downloaded and uncompressed will be
+around 6GB.
+
+4. Configure dependencies and reference data in Galaxy
+------------------------------------------------------
+
+The final steps are to make your Galaxy installation aware of the
+tool dependencies and reference data, so it can locate them both when
+the tool is run.
+
+To target the tool dependencies installed previously, add the
+following lines to the ``dependency_resolvers_conf.xml`` file in the
+Galaxy ``config`` directory::
+
+    <dependency_resolvers>
+    ...
+      <galaxy_packages base_path="/path/to/local_tool_dependencies" />
+      <galaxy_packages base_path="/path/to/local_tool_dependencies" versionless="true" />
+      ...
+    </dependency_resolvers>
+
+(NB it is recommended to place these *before* the ``<conda ... />``
+resolvers)
+
+(If you're not familiar with dependency resolvers in Galaxy then
+see the documentation at
+https://docs.galaxyproject.org/en/master/admin/dependency_resolvers.html
+for more details.)
+
+The tool locates the reference data via an environment variable called
+``AMPLICON_ANALYSIS_REF_DATA_PATH``, which needs to be set to the parent
+directory where the reference data has been installed.
+
+There are various ways to do this, depending on how your Galaxy
+installation is configured:
+
+ * **For local instances:** add a line to set it in the
+   ``config/local_env.sh`` file of your Galaxy installation, e.g[...]
+[...]  ``Amplicon_analysis`` (hint: use your browser's 'find-in-page'
+     search function to help locate it) and click on
+     ``Submit new whitelist`` to update the settings.
+
+Additional details
+==================
+
+Some other things to be aware of:
+
+ * Note that using the Silva database requires a minimum of 18GB RAM
+
+Known problems
+==============
+
+ * Only the ``VSEARCH`` pipeline in Mauro's script is currently
+   available via the Galaxy tool; the ``USEARCH`` and ``QIIME``
+   pipelines have yet to be implemented.
+ * The images in the tool help section are not visible if the
+   tool has been installed locally, or if it has been installed in
+   a Galaxy instance which is served from a subdirectory.
+
+   These are both problems with Galaxy and not the tool, see
+   https://github.com/galaxyproject/galaxy/issues/4490 and
+   https://github.com/galaxyproject/galaxy/issues/1676
+
+Appendix: availability of tool dependencies
+===========================================
+
+The tool takes its dependencies from the underlying pipeline script (see
+https://github.com/MTutino/Amplicon_analysis/blob/master/README.md
+for details).
+
+As noted above, currently the ``install_tool_deps.sh`` script can be
+used to manually install the dependencies for a local tool install.
+
+In principle these should also be available if the tool were installed
+from a toolshed. However it would be preferable in this case to get as
+many of the dependencies as possible via the ``conda`` dependency
+resolver.
+
+The following are known to be available via conda, with the required
+version:
+
+ - cutadapt 1.8.1
+ - sickle-trim 1.33
+ - bioawk 1.0
+ - fastqc 0.11.3
+ - R 3.2.0
+
+Some dependencies are available but with the "wrong" versions:
+
+ - spades (need 3.5.0)
+ - qiime (need 1.8.0)
+ - blast (need 2.2.26)
+ - vsearch (need 1.1.3)
+
+The following dependencies are currently unavailable:
+
+ - fasta_number (need 02jun2015)
+ - fasta-splitter (need 0.2.4)
+ - rdp_classifier (need 2.2)
+ - microbiomeutil (need r20110519)
+
+(NB usearch 6.1.544 and 8.0.1623 are special cases which must be
+handled outside of Galaxy's dependency management systems.)
+
+History
+=======
+
+========== ======================================================================
+Version    Changes
+---------- ----------------------------------------------------------------------
+1.1.0      First official version on Galaxy toolshed.
+1.0.6      Expand inline documentation to provide detailed usage guidance.
+1.0.5      Updates including:
+
+           - Capture read counts from quality control as new output dataset
+           - Capture FastQC per-base quality boxplots for each sample as
+             new output dataset
+           - Add support for -l option (sliding window length for trimming)
+           - Default for -L set to "200"
+1.0.4      Various updates:
+
+           - Additional outputs are captured when a "Categories" file is
+             supplied (alpha diversity rarefaction curves and boxplots)
+           - Sample names derived from Fastqs in a collection of pairs
+             are trimmed to SAMPLE_S* (for Illumina-style Fastq filenames)
+           - Input Fastqs can now be of more general ``fastq`` type
+           - Log file outputs are captured in new output dataset
+           - User can specify a "title" for the job which is copied into
+             the dataset names (to distinguish outputs from different runs)
+           - Improved detection and reporting of problems with input
+             Metatable
+1.0.3      Take the sample names from the collection dataset names when
+           using collection as input (this is now the default input mode);
+           collect additional output dataset; disable ``usearch``-based
+           pipelines (i.e. ``UPARSE`` and ``QIIME``).
+1.0.2      Enable support for FASTQs supplied via dataset collections and
+           fix some broken output datasets.
+1.0.1      Initial version
+========== ======================================================================
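The elided part of the README above covers setting ``AMPLICON_ANALYSIS_REF_DATA_PATH`` for different Galaxy configurations; as a minimal sketch for the "local instances" case it introduces (the path is a placeholder, not a real location), the ``config/local_env.sh`` entry would look like::

    # config/local_env.sh -- sourced by Galaxy at startup (if present)
    # Point the tool at the parent directory created by References.sh;
    # the path below is a placeholder for illustration only
    export AMPLICON_ANALYSIS_REF_DATA_PATH=/path/to/pipeline/data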
diff -r 000000000000 -r 47ec9c6f44b8 amplicon_analysis_pipeline.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/amplicon_analysis_pipeline.py Thu Nov 09 10:13:29 2017 -0500
@@ -0,0 +1,329 @@
+#!/usr/bin/env python
+#
+# Wrapper script to run Amplicon_analysis_pipeline.sh
+# from Galaxy tool
+
+import sys
+import os
+import argparse
+import subprocess
+import glob
+
+class PipelineCmd(object):
+    def __init__(self,cmd):
+        self.cmd = [str(cmd)]
+    def add_args(self,*args):
+        for arg in args:
+            self.cmd.append(str(arg))
+    def __repr__(self):
+        return ' '.join([str(arg) for arg in self.cmd])
+
+def ahref(target,name=None,type=None):
+    if name is None:
+        name = os.path.basename(target)
+    ahref = "<a href='%s'" % target
+    if type is not None:
+        ahref += " type='%s'" % type
+    ahref += ">%s</a>" % name
+    return ahref
+
+def check_errors():
+    # Errors in Amplicon_analysis_pipeline.log
+    with open('Amplicon_analysis_pipeline.log','r') as pipeline_log:
+        log = pipeline_log.read()
+        if "Names in the first column of Metatable.txt and in the second column of Final_name.txt do not match" in log:
+            print_error("""*** Sample IDs don't match dataset names ***
+
+The sample IDs (first column of the Metatable file) don't match the
+supplied sample names for the input Fastq pairs.
+""")
+    # Errors in pipeline output
+    with open('pipeline.log','r') as pipeline_log:
+        log = pipeline_log.read()
+        if "Errors and/or warnings detected in mapping file" in log:
+            with open("Metatable_log/Metatable.log","r") as metatable_log:
+                # Echo the Metatable log file to the tool log
+                print_error("""*** Error in Metatable mapping file ***
+
+%s""" % metatable_log.read())
+        elif "No header line was found in mapping file" in log:
+            # Report error to the tool log
+            print_error("""*** No header in Metatable mapping file ***
+
+Check you've specified the correct file as the input Metatable""")
+
+def print_error(message):
+    width = max([len(line) for line in message.split('\n')]) + 4
+    sys.stderr.write("\n%s\n" % ('*'*width))
+    for line in message.split('\n'):
+        sys.stderr.write("* %s%s *\n" % (line,' '*(width-len(line)-4)))
+    sys.stderr.write("%s\n\n" % ('*'*width))
+
+def clean_up_name(sample):
+    # Remove trailing "_L[0-9]+_001" from Fastq
+    # pair names
+    split_name = sample.split('_')
+    if split_name[-1] == "001":
+        split_name = split_name[:-1]
+    if split_name[-1].startswith('L'):
+        try:
+            int(split_name[-1][1:])
+            split_name = split_name[:-1]
+        except ValueError:
+            pass
+    return '_'.join(split_name)
+
+def list_outputs(filen=None):
+    # List the output directory contents
+    # If filen is specified then will be the filename to
+    # write to, otherwise write to stdout
+    if filen is not None:
+        fp = open(filen,'w')
+    else:
+        fp = sys.stdout
+    results_dir = os.path.abspath("RESULTS")
+    fp.write("Listing contents of output dir %s:\n" % results_dir)
+    ix = 0
+    for d,dirs,files in os.walk(results_dir):
+        ix += 1
+        fp.write("-- %d: %s\n" % (ix,
+                                  os.path.relpath(d,results_dir)))
+        for f in files:
+            ix += 1
+            fp.write("---- %d: %s\n" % (ix,
+                                        os.path.relpath(f,results_dir)))
+    # Close output file
+    if filen is not None:
+        fp.close()
+
+if __name__ == "__main__":
+    # Command line
+    print "Amplicon analysis: starting"
+    p = argparse.ArgumentParser()
+    p.add_argument("metatable",
+                   metavar="METATABLE_FILE",
+                   help="Metatable.txt file")
+    p.add_argument("fastq_pairs",
+                   metavar="SAMPLE_NAME FQ_R1 FQ_R2",
+                   nargs="+",
+                   default=list(),
+                   help="Triplets of SAMPLE_NAME followed by "
+                   "a R1/R2 FASTQ file pair")
+    p.add_argument("-g",dest="forward_pcr_primer")
+    p.add_a[...]
+[...] log:
+                sys.stderr.write("%s" % log.read())
+            # Write log file contents to tool log
+            print "\nAmplicon_analysis_pipeline.log:"
+            with open(log_file,'r') as log:
+                print "%s" % log.read()
+    else:
+        sys.stderr.write("ERROR missing log file \"%s\"\n" %
+                         log_file)
+
+    # Handle FastQC boxplots
+    print "Amplicon analysis: collating per base quality boxplots"
+    with open("fastqc_quality_boxplots.html","w") as quality_boxplots:
+        # PHRED value for trimming
+        phred_score = 20
+        if args.trimming_threshold is not None:
+            phred_score = args.trimming_threshold
+        # Write header for HTML output file
+        quality_boxplots.write("""<html>
+<head>
+<title>Amplicon analysis pipeline: Per-base Quality Boxplots (FastQC)</title>
+</head>
+<body>
+<h1>Amplicon analysis pipeline: Per-base Quality Boxplots (FastQC)</h1>
+""")
+        # Look for raw and trimmed FastQC output for each sample
+        for sample_name in sample_names:
+            fastqc_dir = os.path.join(sample_name,"FastQC")
+            quality_boxplots.write("<h2>%s</h2>" % sample_name)
+            for d in ("Raw","cutdapt_sickle/Q%s" % phred_score):
+                quality_boxplots.write("<h3>%s</h3>" % d)
+                fastqc_html_files = glob.glob(
+                    os.path.join(fastqc_dir,d,"*_fastqc.html"))
+                if not fastqc_html_files:
+                    quality_boxplots.write("<p>No FastQC outputs found</p>")
+                    continue
+                # Pull out the per-base quality boxplots
+                for f in fastqc_html_files:
+                    boxplot = None
+                    with open(f) as fp:
+                        for line in fp.read().split(">"):
+                            try:
+                                line.index("alt=\"Per base quality graph\"")
+                                boxplot = line + ">"
+                                break
+                            except ValueError:
+                                pass
+                    if boxplot is None:
+                        boxplot = "Missing plot"
+                    quality_boxplots.write("<h4>%s</h4><p>%s</p>" %
+                                           (os.path.basename(f),
+                                            boxplot))
+        # Close the HTML document once all samples are done
+        quality_boxplots.write("""</body>
+</html>
+""")
+
+    # Handle additional output when categories file was supplied
+    if args.categories_file is not None:
+        # Alpha diversity boxplots
+        print "Amplicon analysis: indexing alpha diversity boxplots"
+        boxplots_dir = os.path.abspath(
+            os.path.join("RESULTS",
+                         "%s_%s" % (args.pipeline.title(),
+                                    ("gg" if not args.use_silva
+                                     else "silva")),
+                         "Alpha_diversity",
+                         "Alpha_diversity_boxplot",
+                         "Categories_shannon"))
+        print "Amplicon analysis: gathering PDFs from %s" % boxplots_dir
+        boxplot_pdfs = [os.path.basename(pdf)
+                        for pdf in
+                        sorted(glob.glob(
+                            os.path.join(boxplots_dir,"*.pdf")))]
+        with open("alpha_diversity_boxplots.html","w") as boxplots_out:
+            boxplots_out.write("""<html>
+<head>
+<title>Amplicon analysis pipeline: Alpha Diversity Boxplots (Shannon)</title>
+</head>
+<body>
+<h1>Amplicon analysis pipeline: Alpha Diversity Boxplots (Shannon)</h1>
+""")
+            boxplots_out.write("<ul>\n")
+            for pdf in boxplot_pdfs:
+                boxplots_out.write("<li>%s</li>\n" % ahref(pdf))
+            boxplots_out.write("</ul>\n")
+            boxplots_out.write("""</body>
+</html>
+""")
+
+    # Finish
+    print "Amplicon analysis: finishing, exit code: %s" % exit_code
+    sys.exit(exit_code)
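For orientation, the wrapper's argparse definition above (a Metatable file followed by SAMPLE_NAME/R1/R2 triplets, plus the ``-g``/``-G``/``-q``/``-l``/``-O``/``-L``/``-P``/``-r``/``-c`` options) implies a call of the following shape. This is a hypothetical invocation: sample names, file names and option values are placeholders, and ``-P Vsearch`` mirrors how the tool XML below passes the pipeline name::

    # Sketch of a direct wrapper invocation outside Galaxy, assuming the
    # options defined in the argparse section above; all names are
    # placeholders for illustration
    python amplicon_analysis_pipeline.py \
        -P Vsearch \
        -q 20 \
        -L 200 \
        -r "$AMPLICON_ANALYSIS_REF_DATA_PATH" \
        Metatable.txt \
        SAMPLE1 SAMPLE1_S1_L001_R1_001.fastq SAMPLE1_S1_L001_R2_001.fastq \
        SAMPLE2 SAMPLE2_S2_L001_R1_001.fastq SAMPLE2_S2_L001_R2_001.fastq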
diff -r 000000000000 -r 47ec9c6f44b8 amplicon_analysis_pipeline.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/amplicon_analysis_pipeline.xml Thu Nov 09 10:13:29 2017 -0500
@@ -0,0 +1,484 @@
+<tool id="amplicon_analysis_pipeline" name="Amplicon Analysis Pipeline" version="1.0.6">
+  <description>analyse 16S rRNA data from Illumina MiSeq paired-end reads</description>
+  <requirements>
+    <requirement type="package" version="1.1">amplicon_analysis_pipeline</requirement>
+    <requirement type="package" version="1.11">cutadapt</requirement>
+    <requirement type="package" version="1.33">sickle</requirement>
+    <requirement type="package" version="27-08-2013">bioawk</requirement>
+    <requirement type="package" version="2.8.1">pandaseq</requirement>
+    <requirement type="package" version="3.5.0">spades</requirement>
+    <requirement type="package" version="0.11.3">fastqc</requirement>
+    <requirement type="package" version="1.8.0">qiime</requirement>
+    <requirement type="package" version="2.2.26">blast</requirement>
+    <requirement type="package" version="0.2.4">fasta-splitter</requirement>
+    <requirement type="package" version="2.2">rdp-classifier</requirement>
+    <requirement type="package" version="3.2.0">R</requirement>
+    <requirement type="package" version="1.1.3">vsearch</requirement>
+    <requirement type="package" version="2010-04-29">microbiomeutil</requirement>
+    <requirement type="package">fasta_number</requirement>
+  </requirements>
+  <stdio>
+    <exit_code range="1:" />
+  </stdio>
+  <command><![CDATA[
+  ## Set the reference database name
+  #if $reference_database == ""
+    #set reference_database_name = "gg"
+  #else
+    #set reference_database_name = "silva"
+  #end if
+
+  ## Run the amplicon analysis pipeline wrapper
+  python $__tool_directory__/amplicon_analysis_pipeline.py
+  ## Set options
+  #if str( $forward_pcr_primer ) != ""
+  -g "$forward_pcr_primer"
+  #end if
+  #if str( $reverse_pcr_primer ) != ""
+  -G "$reverse_pcr_primer"
+  #end if
+  #if str( $trimming_threshold ) != ""
+  -q $trimming_threshold
+  #end if
+  #if str( $sliding_window_length ) != ""
+  -l $sliding_window_length
+  #end if
+  #if str( $minimum_overlap ) != ""
+  -O $minimum_overlap
+  #end if
+  #if str( $minimum_length ) != ""
+  -L $minimum_length
+  #end if
+  -P $pipeline
+  -r \$AMPLICON_ANALYSIS_REF_DATA_PATH
+  #if str( $reference_database ) != ""
+    "${reference_database}"
+  #end if
+  #if str($categories_file_in) != 'None'
+    -c "${categories_file_in}"
+  #end if
+  ## Input files
+  "${metatable_file_in}"
+  ## FASTQ pairs
+  #if str($input_type.pairs_or_collection) == "collection"
+    #set fastq_pairs = $input_type.fastq_collection
+  #else
+    #set fastq_pairs = $input_type.fastq_pairs
+  #end if
+  #for $fq_pair in $fastq_pairs
+    "${fq_pair.name}" "${fq_pair.forward}" "${fq_pair.reverse}"
+  #end for
+  &&
+
+  ## Collect outputs
+  cp Metatable_log/Metatable_mod.txt "${metatable_mod}" &&
+  cp ${pipeline}_OTU_tables/multiplexed_linearized_dereplicated_mc2_repset_nonchimeras_tax_OTU_table.biom "${tax_otu_table_biom_file}" &&
+  cp ${pipeline}_OTU_tables/otus.tre "${otus_tre_file}" &&
+  cp RESULTS/${pipeline}_${reference_database_name}/OTUs_count.txt "${otus_count_file}" &&
+  cp RESULTS/${pipeline}_${reference_database_name}/table_summary.txt "${table_summary_file}" &&
+  cp Multiplexed_files/${pipeline}_pipeline/multiplexed_linearized_dereplicated_mc2_repset_nonchimeras_OTUs.fasta "${dereplicated_nonchimera_otus_fasta}" &&
+  cp QUALITY_CONTROL/Reads_count.txt "$read_counts_out" &&
+  cp fastqc_quality_boxplots.html "${fastqc_quality_boxplots_html}" &&
+
+  ## HTML outputs
+
+  ## OTU table
+  mkdir $heatmap_otu_table_html.files_path &&
+  cp -r RESULTS/${pipeline}_${reference_database_name}/Heatmap/js $heatmap_otu_table_html.files_path &&
+  cp RESULTS/${pipeline}_${reference_database_name}/Heatmap/otu_table.html "${heatmap_otu_table_html}" &&
+
+  ## Phylum genus barcharts
+  mkdir $phylum_genus_dist_barcharts_html.files_path &&
+  cp -r RESULTS/${pipeline}_${reference_database_name}/phylum_genus_charts/charts $phylum_genus_dist_barchart[...]
+[...]. Insert the PCR primer sequence
+   in the corresponding field. DO NOT include any barcode or adapter
+   sequence. If the PCR primers have already been trimmed by the MiSeq
+   and you include the sequence in this field, this will lead to an
+   error. Only include the sequences if they are still present in the
+   fastq files.
+
+ * **Threshold quality below which reads will be trimmed** Choose the
+   Phred score used by Sickle to trim the reads at the 3’ end.
+
+ * **Minimum length to retain a read after trimming** If the read length
+   after trimming is shorter than a user-defined length, the read, along
+   with the corresponding read pair, will be discarded.
+
+ * **Minimum overlap in bp between forward and reverse reads** Choose the
+   minimum basepair overlap used by Pandaseq to assemble the reads.
+   Default is 10.
+
+ * **Minimum length in bp to keep a sequence after overlapping** Choose the
+   minimum sequence length used by Pandaseq to keep a sequence after the
+   overlapping. This depends on the expected amplicon length. Default is
+   380 (used for V3-V4 16S sequencing; expected length ~440bp).
+
+ * **Pipeline to use for analysis** Choose the pipeline to use for OTU
+   clustering and chimera removal. The Galaxy tool currently supports
+   ``Vsearch`` only. ``Uparse`` and ``QIIME`` are planned to be added
+   shortly (the tools are already available for the stand-alone pipeline).
+
+ * **Reference database** Choose between the ``GreenGenes`` and ``Silva``
+   databases for taxa assignment.
+
+Click on **Execute** to start the analysis.
+
+5. Results
+**********
+
+Results are entirely generated using QIIME scripts. The results will
+appear in the History panel when the analysis is completed.
+
+ * **Vsearch_tax_OTU_table (biom format)** The OTU table in BIOM format
+   (http://biom-format.org/)
+
+ * **Vsearch_OTUs.tree** Phylogenetic tree constructed using the
+   ``make_phylogeny.py`` (fasttree) QIIME script
+   (http://qiime.org/scripts/make_phylogeny.html)
+
+ * **Vsearch_phylum_genus_dist_barcharts_HTML** HTML file with bar
+   charts at Phylum, Genus and Species level
+   (http://qiime.org/scripts/summarize_taxa.html and
+   http://qiime.org/scripts/plot_taxa_summary.html)
+
+ * **Vsearch_OTUs_count_file** Summary of OTU counts per sample
+   (http://biom-format.org/documentation/summarizing_biom_tables.html)
+
+ * **Vsearch_table_summary_file** Summary of sequence counts per sample
+   (http://biom-format.org/documentation/summarizing_biom_tables.html)
+
+ * **Vsearch_multiplexed_linearized_dereplicated_mc2_repset_nonchimeras_OTUs.fasta**
+   Fasta file with OTU sequences
+
+ * **Vsearch_heatmap_OTU_table_HTML** Interactive OTU heatmap
+   (http://qiime.org/1.8.0/scripts/make_otu_heatmap_html.html)
+
+ * **Vsearch_beta_diversity_weighted_2D_plots_HTML** PCoA plots in HTML
+   format using the weighted UniFrac distance measure. Samples are grouped
+   by the column names present in the Metatable file. The samples are
+   first rarefied to the minimum sequencing depth
+   (http://qiime.org/scripts/beta_diversity_through_plots.html)
+
+ * **Vsearch_beta_diversity_unweighted_2D_plots_HTML** PCoA plots in HTML
+   format using the unweighted UniFrac distance measure. Samples are grouped
+   by the column names present in the Metatable file. The samples are
+   first rarefied to the minimum sequencing depth
+   (http://qiime.org/scripts/beta_diversity_through_plots.html)
+
+Code availability
+-----------------
+
+**Code is available at** https://github.com/MTutino/Amplicon_analysis
+
+Credits
+-------
+
+Pipeline author: Mauro Tutino
+
+Galaxy tool: Peter Briggs
+
+  ]]></help>
+  <citations>
+    <citation type="bibtex">
+      @misc{githubAmplicon_analysis,
+      author = {Tutino, Mauro},
+      year = {2017},
+      title = {Amplicon Analysis Pipeline},
+      publisher = {GitHub},
+      journal = {GitHub repository},
+      url = {https://github.com/MTutino/Amplicon_analysis},
+}</citation>
+  </citations>
+</tool>
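Each ``<requirement>`` above is satisfied at run time by the ``galaxy_packages`` dependency resolver configured in the README, which looks for ``<base_path>/<package>/<version>/env.sh`` under the directory populated by ``install_tool_deps.sh`` and sources it before the job script runs. A manual spot-check of one requirement might look like the following sketch (paths are placeholders)::

    # Simulate what the galaxy_packages resolver does for
    # <requirement type="package" version="1.11">cutadapt</requirement>
    . /path/to/local_tool_dependencies/cutadapt/1.11/env.sh
    cutadapt --version    # expected to report 1.11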
diff -r 000000000000 -r 47ec9c6f44b8 install_tool_deps.sh
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/install_tool_deps.sh Thu Nov 09 10:13:29 2017 -0500
@@ -0,0 +1,706 @@
+#!/bin/bash -e
+#
+# Install the tool dependencies for Amplicon_analysis_pipeline.sh for
+# testing from command line
+#
+function install_python_package() {
+    echo Installing $2 $3 from $4 under $1
+    local install_dir=$1
+    local install_dirs="$install_dir $install_dir/bin $install_dir/lib/python2.7/site-packages"
+    for d in $install_dirs ; do
+        if [ ! -d $d ] ; then
+            mkdir -p $d
+        fi
+    done
+    wd=$(mktemp -d)
+    echo Moving to $wd
+    pushd $wd
+    wget -q $4
+    if [ ! -f "$(basename $4)" ] ; then
+        echo "No archive $(basename $4)"
+        exit 1
+    fi
+    tar xzf $(basename $4)
+    if [ ! -d "$5" ] ; then
+        echo "No directory $5"
+        exit 1
+    fi
+    cd $5
+    /bin/bash <<EOF
+export PYTHONPATH=$install_dir:$PYTHONPATH && \
+export PYTHONPATH=$install_dir/lib/python2.7/site-packages:$PYTHONPATH && \
+python setup.py install --prefix=$install_dir --install-scripts=$install_dir/bin --install-lib=$install_dir/lib/python2.7/site-packages >>$INSTALL_DIR/INSTALLATION.log 2>&1
+EOF
+    popd
+    rm -rf $wd/*
+    rmdir $wd
+}
+function install_amplicon_analysis_pipeline_1_1() {
+    install_amplicon_analysis_pipeline $1 1.1
+}
+function install_amplicon_analysis_pipeline_1_0() {
+    install_amplicon_analysis_pipeline $1 1.0
+}
+function install_amplicon_analysis_pipeline() {
+    version=$2
+    echo Installing Amplicon_analysis $version
+    install_dir=$1/amplicon_analysis_pipeline/$version
+    if [ -f $install_dir/env.sh ] ; then
+        return
+    fi
+    mkdir -p $install_dir
+    echo Moving to $install_dir
+    pushd $install_dir
+    wget -q https://github.com/MTutino/Amplicon_analysis/archive/v${version}.tar.gz
+    tar zxf v${version}.tar.gz
+    mv Amplicon_analysis-${version} Amplicon_analysis
+    rm -rf v${version}.tar.gz
+    popd
+    # Make setup file
+    cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup Amplicon_analysis/$version
+echo Setting up Amplicon analysis pipeline $version
+export PATH=$install_dir/Amplicon_analysis:\$PATH
+## AMPLICON_ANALYSIS_REF_DATA_PATH should be set in
+## config/local_env.sh or in the job_conf.xml file
+## - see the README
+##export AMPLICON_ANALYSIS_REF_DATA_PATH=
+#
+EOF
+}
+function install_amplicon_analysis_pipeline_1_0_patched() {
+    version="1.0-patched"
+    echo Installing Amplicon_analysis $version
+    install_dir=$1/amplicon_analysis_pipeline/$version
+    if [ -f $install_dir/env.sh ] ; then
+        return
+    fi
+    mkdir -p $install_dir
+    echo Moving to $install_dir
+    pushd $install_dir
+    # Clone and patch analysis pipeline scripts
+    git clone https://github.com/pjbriggs/Amplicon_analysis.git
+    cd Amplicon_analysis
+    git checkout -b $version
+    branches=
+    if [ ! -z "$branches" ] ; then
+        for branch in $branches ; do
+            git checkout -b $branch origin/$branch
+            git checkout $version
+            git merge -m "Merge $branch into $version" $branch
+        done
+    fi
+    cd ..
+    popd
+    # Make setup file
+    cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup Amplicon_analysis/$version
+echo Setting up Amplicon analysis pipeline $version
+export PATH=$install_dir/Amplicon_analysis:\$PATH
+## AMPLICON_ANALYSIS_REF_DATA_PATH should be set in
+## config/local_env.sh or in the job_conf.xml file
+## - see the README
+##export AMPLICON_ANALYSIS_REF_DATA_PATH=
+#
+EOF
+}
+function install_cutadapt_1_11() {
+    echo Installing cutadapt 1.11
+    INSTALL_DIR=$1/cutadapt/1.11
+    if [ -f $INSTALL_DIR/env.sh ] ; then
+        return
+    fi
+    mkdir -p $INSTALL_DIR
+    install_python_package $INSTALL_DIR cutadapt 1.11 \
+        https://pypi.python.org/packages/47/bf/9045e90dac084a90aa2bb72c7d5aadefaea96a5776f445f5b5d9a7a2c78b/cutadapt-1.11.tar.gz \
+        cutadapt-1.11
+    # Make setup file
+    cat > $INSTALL_DIR/env.sh <<EOF
+#!/bin/sh
+# Source this to setup cutadapt/1.11
+echo Setting up cutadapt 1.11
+#if [ -f $1/python/2.7.10/env.sh ] ; then
+#   . $1/python/2.7.10/env.sh
+#fi
+export PA[...]
+[...]tter.pl
+    mv fasta-splitter.pl $install_dir/bin
+    popd
+    # Clean up
+    rm -rf $wd/*
+    rmdir $wd
+    # Make setup file
+cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup fasta-splitter/0.2.4
+echo Setting up fasta-splitter 0.2.4
+export PATH=$install_dir/bin:\$PATH
+export PERL5LIB=$install_dir/lib/perl5:\$PERL5LIB
+#
+EOF
+}
+function install_rdp_classifier_2_2() {
+    echo Installing rdp-classifier 2.2
+    local install_dir=$1/rdp-classifier/2.2
+    if [ -f $install_dir/env.sh ] ; then
+        return
+    fi
+    mkdir -p $install_dir
+    local wd=$(mktemp -d)
+    echo Moving to $wd
+    pushd $wd
+    wget -q https://sourceforge.net/projects/rdp-classifier/files/rdp-classifier/rdp_classifier_2.2.zip
+    unzip -qq rdp_classifier_2.2.zip
+    cd rdp_classifier_2.2
+    mv * $install_dir
+    popd
+    # Clean up
+    rm -rf $wd/*
+    rmdir $wd
+    # Make setup file
+cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup rdp-classifier/2.2
+echo Setting up RDP classifier 2.2
+export RDP_JAR_PATH=$install_dir/rdp_classifier-2.2.jar
+#
+EOF
+}
+function install_R_3_2_0() {
+    # Adapted from https://github.com/fls-bioinformatics-core/galaxy-tools/blob/master/local_dependency_installers/R.sh
+    echo Installing R 3.2.0
+    local install_dir=$1/R/3.2.0
+    if [ -f $install_dir/env.sh ] ; then
+        return
+    fi
+    mkdir -p $install_dir
+    local wd=$(mktemp -d)
+    echo Moving to $wd
+    pushd $wd
+    wget -q http://cran.r-project.org/src/base/R-3/R-3.2.0.tar.gz
+    tar xzf R-3.2.0.tar.gz
+    cd R-3.2.0
+    ./configure --prefix=$install_dir
+    make
+    make install
+    popd
+    # Clean up
+    rm -rf $wd/*
+    rmdir $wd
+    # Make setup file
+cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup R/3.2.0
+echo Setting up R 3.2.0
+export PATH=$install_dir/bin:\$PATH
+export TCL_LIBRARY=$install_dir/lib/libtcl8.4.so
+export TK_LIBRARY=$install_dir/lib/libtk8.4.so
+#
+EOF
+}
+function install_uc2otutab() {
+    # See http://drive5.com/python/uc2otutab_py.html
+    echo Installing uc2otutab
+    # Install to "default" version i.e. essentially a versionless
+    # installation (see Galaxy dependency resolver docs)
+    local install_dir=$1/uc2otutab/default
+    if [ -f $install_dir/env.sh ] ; then
+        return
+    fi
+    mkdir -p $install_dir/bin
+    local wd=$(mktemp -d)
+    echo Moving to $wd
+    pushd $wd
+    wget -q http://drive5.com/python/python_scripts.tar.gz
+    tar zxf python_scripts.tar.gz
+    mv die.py fasta.py progress.py uc.py $install_dir/bin
+    echo "#!/usr/bin/env python" >$install_dir/bin/uc2otutab.py
+    cat uc2otutab.py >>$install_dir/bin/uc2otutab.py
+    chmod +x $install_dir/bin/uc2otutab.py
+    popd
+    # Clean up
+    rm -rf $wd/*
+    rmdir $wd
+    # Make setup file
+cat > $install_dir/env.sh <<EOF
+#!/bin/sh
+# Source this to setup uc2otutab/default
+echo Setting up uc2otutab \(default\)
+export PATH=$install_dir/bin:\$PATH
+#
+EOF
+}
+##########################################################
+# Main script starts here
+##########################################################
+# Fetch top-level installation directory from command line
+TOP_DIR=$1
+if [ -z "$TOP_DIR" ] ; then
+    echo Usage: $(basename $0) DIR
+    exit
+fi
+if [ -z "$(echo $TOP_DIR | grep ^/)" ] ; then
+    TOP_DIR=$(pwd)/$TOP_DIR
+fi
+if [ ! -d "$TOP_DIR" ] ; then
+    mkdir -p $TOP_DIR
+fi
+# Install dependencies
+install_amplicon_analysis_pipeline_1_1 $TOP_DIR
+install_cutadapt_1_11 $TOP_DIR
+install_sickle_1_33 $TOP_DIR
+install_bioawk_27_08_2013 $TOP_DIR
+install_pandaseq_2_8_1 $TOP_DIR
+install_spades_3_5_0 $TOP_DIR
+install_fastqc_0_11_3 $TOP_DIR
+install_qiime_1_8_0 $TOP_DIR
+install_vsearch_1_1_3 $TOP_DIR
+install_microbiomeutil_2010_04_29 $TOP_DIR
+install_blast_2_2_26 $TOP_DIR
+install_fasta_number $TOP_DIR
+install_fasta_splitter_0_2_4 $TOP_DIR
+install_rdp_classifier_2_2 $TOP_DIR
+install_R_3_2_0 $TOP_DIR
+install_uc2otutab $TOP_DIR
+##
+#
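A typical bootstrap run of the script above, with a spot-check of one installed dependency, might look like the following sketch (the top-level directory is a placeholder; note that each ``install_*`` function returns early if its ``env.sh`` already exists, so the script can be re-run safely after a partial failure)::

    # Fetch and build everything under one top-level directory
    # (path is a placeholder)
    ./install_tool_deps.sh /path/to/local_tool_dependencies

    # Spot-check one dependency outside Galaxy by sourcing its env.sh,
    # mirroring what the dependency resolver does at job run time
    . /path/to/local_tool_dependencies/fastqc/0.11.3/env.sh
    fastqc --version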
diff -r 000000000000 -r 47ec9c6f44b8 static/images/Pipeline_description_Fig1.png
Binary file static/images/Pipeline_description_Fig1.png has changed
diff -r 000000000000 -r 47ec9c6f44b8 static/images/Pipeline_description_Fig2.png
Binary file static/images/Pipeline_description_Fig2.png has changed
diff -r 000000000000 -r 47ec9c6f44b8 static/images/Pipeline_description_Fig3.png
Binary file static/images/Pipeline_description_Fig3.png has changed