Galaxy |

Changeset 0:6820983ba5d5 (2014-03-18)

Next changeset 1:ebb02ba5987c (2014-03-21)

Commit message:
Uploaded

added:
COPYING
bwa_mem.py
bwa_mem.xml
readme.rst
tool-data/bwa_index.loc.sample
tool_data_table_conf.xml.sample
tool_dependencies.xml

diff -r 000000000000 -r 6820983ba5d5 COPYING
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/COPYING Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,182 @@
+Copyright (c) 2009-2013 Pennsylvania State University
+Copyright © 2013 Yufei Luo <luoyufei@gmail.com>
+Copyright © 2014 CRS4 Srl. http://www.crs4.it/
+Created by:
+Kelly Vincent <kpvincent@bx.psu.edu>
+Daniel Blankenberg <dan@bx.psu.edu>
+Yufei Luo <luoyufei@gmail.com>
+Nicola Soranzo <nicola.soranzo@crs4.it>
+
+Licensed under the Academic Free License version 3.0
+
+ 1) Grant of Copyright License. Licensor grants You a worldwide, royalty-free,
+    non-exclusive, sublicensable license, for the duration of the copyright, to
+    do the following:
+
+    a) to reproduce the Original Work in copies, either alone or as part of a
+       collective work;
+
+    b) to translate, adapt, alter, transform, modify, or arrange the Original
+       Work, thereby creating derivative works ("Derivative Works") based upon
+       the Original Work;
+
+    c) to distribute or communicate copies of the Original Work and Derivative
+       Works to the public, under any license of your choice that does not
+       contradict the terms and conditions, including Licensor's reserved
+       rights and remedies, in this Academic Free License;
+
+    d) to perform the Original Work publicly; and
+
+    e) to display the Original Work publicly.
+
+ 2) Grant of Patent License. Licensor grants You a worldwide, royalty-free,
+    non-exclusive, sublicensable license, under patent claims owned or
+    controlled by the Licensor that are embodied in the Original Work as
+    furnished by the Licensor, for the duration of the patents, to make, use,
+    sell, offer for sale, have made, and import the Original Work and
+    Derivative Works.
+
+ 3) Grant of Source Code License. The term "Source Code" means the preferred
+    form of the Original Work for making modifications to it and all available
+    documentation describing how to modify the Original Work. Licensor agrees
+    to provide a machine-readable copy of the Source Code of the Original Work
+    along with each copy of the Original Work that Licensor distributes.
+    Licensor reserves the right to satisfy this obligation by placing a
+    machine-readable copy of the Source Code in an information repository
+    reasonably calculated to permit inexpensive and convenient access by You
+    for as long as Licensor continues to distribute the Original Work.
+
+ 4) Exclusions From License Grant. Neither the names of Licensor, nor the
+    names of any contributors to the Original Work, nor any of their
+    trademarks or service marks, may be used to endorse or promote products
+    derived from this Original Work without express prior permission of the
+    Licensor. Except as expressly stated herein, nothing in this License
+    grants any license to Licensor's trademarks, copyrights, patents, trade
+    secrets or any other intellectual property. No patent license is granted
+    to make, use, sell, offer for sale, have made, or import embodiments of
+    any patent claims other than the licensed claims defined in Section 2.
+    No license is granted to the trademarks of Licensor even if such marks
+    are included in the Original Work. Nothing in this License shall be
+    interpreted to prohibit Licensor from licensing under terms different
+    from this License any Original Work that Licensor otherwise would have a
+    right to license.
+
+ 5) External Deployment. The term "External Deployment" means the use,
+    distribution, or communication of the Original Work or Derivative Works
+    in any way such that the Original Work or Derivative Works may be used by
+    anyone other than You, whether those works are distributed or
+    communicated to those persons or made available as an application
+    intended for use over a network. As an express condition for the grants
+    of license hereunder, You must treat any External Deployment by You of
+    the Original Work or a Derivative Work as a distribution under
+    se
[...]
) Termination for Patent Action. This License shall terminate
+    automatically and You may no longer exercise any of the rights granted
+    to You by this License as of the date You commence an action, including
+    a cross-claim or counterclaim, against Licensor or any licensee alleging
+    that the Original Work infringes a patent. This termination provision
+    shall not apply for an action alleging patent infringement by
+    combinations of the Original Work with other software or hardware.
+
+11) Jurisdiction, Venue and Governing Law. Any action or suit relating to
+    this License may be brought only in the courts of a jurisdiction wherein
+    the Licensor resides or in which Licensor conducts its primary business,
+    and under the laws of that jurisdiction excluding its conflict-of-law
+    provisions. The application of the United Nations Convention on
+    Contracts for the International Sale of Goods is expressly excluded. Any
+    use of the Original Work outside the scope of this License or after its
+    termination shall be subject to the requirements and penalties of
+    copyright or patent law in the appropriate jurisdiction. This section
+    shall survive the termination of this License.
+
+12) Attorneys' Fees. In any action to enforce the terms of this License or
+    seeking damages relating thereto, the prevailing party shall be entitled
+    to recover its costs and expenses, including, without limitation,
+    reasonable attorneys' fees and costs incurred in connection with such
+    action, including any appeal of such action. This section shall survive
+    the termination of this License.
+
+13) Miscellaneous. If any provision of this License is held to be
+    unenforceable, such provision shall be reformed only to the extent
+    necessary to make it enforceable.
+
+14) Definition of "You" in This License. "You" throughout this License,
+    whether in upper or lower case, means an individual or a legal entity
+    exercising rights under, and complying with all of the terms of, this
+    License. For legal entities, "You" includes any entity that controls, is
+    controlled by, or is under common control with you. For purposes of this
+    definition, "control" means (i) the power, direct or indirect, to cause
+    the direction or management of such entity, whether by contract or
+    otherwise, or (ii) ownership of fifty percent (50%) or more of the
+    outstanding shares, or (iii) beneficial ownership of such entity.
+
+15) Right to Use. You may use the Original Work in all ways not otherwise
+    restricted or conditioned by this License or by law, and Licensor
+    promises not to interfere with or be responsible for such uses by You.
+
+16) Modification of This License. This License is Copyright © 2005 Lawrence
+    Rosen. Permission is granted to copy, distribute, or communicate this
+    License without modification. Nothing in this License permits You to
+    modify this License as applied to the Original Work or to Derivative
+    Works. However, You may modify the text of this License and copy,
+    distribute or communicate your modified version (the "Modified
+    License") and apply it to other original works of authorship subject to
+    the following conditions: (i) You may not indicate in any way that your
+    Modified License is the "Academic Free License" or "AFL" and you may not
+    use those names in the name of your Modified License; (ii) You must
+    replace the notice specified in the first paragraph above with the
+    notice "Licensed under <insert your license name here>" or with a notice
+    of your own that is not confusingly similar to the notice in this
+    License; and (iii) You may not claim that your original works are open
+    source software unless your Modified License has been approved by Open
+    Source Initiative (OSI) and You comply with its license review and
+    certification process.

diff -r 000000000000 -r 6820983ba5d5 bwa_mem.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/bwa_mem.py Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,273 @@
+# -*- coding: utf-8 -*-
+#!/usr/bin/env python
+## yufei.luo@gustave.roussy 22/07/2013
+## Copyright © 2014 CRS4 Srl. http://www.crs4.it/
+## Modified by:
+## Nicola Soranzo <nicola.soranzo@crs4.it>
+
+"""
+Runs BWA on single-end or paired-end data.
+Produces a SAM file containing the mappings.
+Works with BWA version >= 0.7.5.
+NOTICE: In this wrapper, we only use 'mem' for mapping step.
+
+usage: bwa_mem.py [options]
+
+See below for options
+"""
+
+import optparse, os, shutil, subprocess, sys, tempfile
+
+def stop_err( msg ):
+    sys.stderr.write( '%s\n' % msg )
+    sys.exit()
+
+def check_is_double_encoded( fastq ):
+    # check that first read is bases, not one base followed by numbers
+    bases = [ 'A', 'C', 'G', 'T', 'a', 'c', 'g', 't', 'N' ]
+    nums = [ '0', '1', '2', '3' ]
+    for line in file( fastq, 'rb'):
+        if not line.strip() or line.startswith( '@' ):
+            continue
+        if len( [ b for b in line.strip() if b in nums ] ) > 0:
+            return False
+        elif line.strip()[0] in bases and len( [ b for b in line.strip() if b in bases ] ) == len( line.strip() ):
+            return True
+        else:
+            raise Exception, 'First line in first read does not appear to be a valid FASTQ read in either base-space or color-space'
+    raise Exception, 'There is no non-comment and non-blank line in your FASTQ file'
+
+def __main__():
+    descr = "bwa_mem.py: Map (long length) reads against a reference genome with BWA-MEM."
+    parser = optparse.OptionParser(description=descr)
+    parser.add_option( '-t', '--threads', default=1, help='The number of threads to use [1]' )
+    parser.add_option( '--ref', help='The reference genome to use or index' )
+    parser.add_option( '-f', '--fastq', help='The (forward) fastq file to use for the mapping' )
+    parser.add_option( '-F', '--rfastq', help='The reverse fastq file to use for mapping if paired-end data' )
+    parser.add_option( '-u', '--output', help='The file to save the output (SAM format)' )
+    parser.add_option( '-g', '--genAlignType', help='The type of pairing (single or paired)' )
+    parser.add_option( '--params', help='Parameter setting to use (pre_set or full)' )
+    parser.add_option( '-s', '--fileSource', help='Whether to use a previously indexed reference sequence or one from history (indexed or history)' )
+    parser.add_option( '-D', '--dbkey', help='Dbkey for reference genome' )
+
+    parser.add_option( '-k', '--minEditDistSeed', default=19, type=int, help='Minimum edit distance to the seed [19]' )
+    parser.add_option( '-w', '--bandWidth', default=100, type=int, help='Band width for banded alignment [100]' )
+    parser.add_option( '-d', '--offDiagonal', default=100, type=int, help='off-diagonal X-dropoff [100]' )
+    parser.add_option( '-r', '--internalSeeds', default=1.5, type=float, help='look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]' )
+    parser.add_option( '-c', '--seedsOccurrence', default=10000, type=int, help='skip seeds with more than INT occurrences [10000]' )
+    parser.add_option( '-S', '--mateRescue', default=False, help='skip mate rescue' )
+    parser.add_option( '-P', '--skipPairing', default=False, help='skip pairing, mate rescue performed unless -S also in use' )
+    parser.add_option( '-A', '--seqMatch', default=1, type=int, help='score of a sequence match' )
+    parser.add_option( '-B', '--mismatch', default=4, type=int, help='penalty for a mismatch' )
+    parser.add_option( '-O', '--gapOpen', default=6, type=int, help='gap open penalty' )
+    parser.add_option( '-E', '--gapExtension', default=None, help='gap extension penalty; a gap of size k cost {-O} + {-E}*k [1]' )
+    parser.add_option( '-L', '--clipping', default=5, type=int, help='penalty for clipping [5]' )
+    parser.add_option( '-U', '--unpairedReadpair', default=17, type=int, help='penalty for an unpaired read pair [17]' )
+    parser.add_option( '-p', '--interPairEnd', defaul
[...]
gsm )
+            readGroup = '@RG\tID:%s\tLB:%s\tPL:%s\tSM:%s' % ( options.rgid, options.rglb, options.rgpl, options.rgsm )
+            if options.rgpu:
+                readGroup += '\tPU:%s' % options.rgpu
+            if options.rgcn:
+                readGroup += '\tCN:%s' % options.rgcn
+            if options.rgds:
+                readGroup += '\tDS:%s' % options.rgds
+            if options.rgdt:
+                readGroup += '\tDT:%s' % options.rgdt
+            if options.rgfo:
+                readGroup += '\tFO:%s' % options.rgfo
+            if options.rgks:
+                readGroup += '\tKS:%s' % options.rgks
+            if options.rgpg:
+                readGroup += '\tPG:%s' % options.rgpg
+            if options.rgpi:
+                readGroup += '\tPI:%s' % options.rgpi
+            end_cmds += ' -R "%s" ' % readGroup
+
+        if options.interPairEnd:
+            end_cmds += '-p %s ' % options.interPairEnd
+        if options.mark:
+            end_cmds += '-M '
+
+
+    if options.genAlignType == 'paired':
+        cmd = 'bwa mem %s %s %s %s %s > %s' % ( start_cmds, ref_file_name, fastq, rfastq, end_cmds, options.output )
+    else:
+        cmd = 'bwa mem %s %s %s %s > %s' % ( start_cmds, ref_file_name, fastq, end_cmds, options.output )
+
+    # perform alignments
+    buffsize = 1048576
+    try:
+        # need to nest try-except in try-finally to handle 2.4
+        try:
+            try:
+                tmp = tempfile.NamedTemporaryFile( dir=tmp_dir ).name
+                tmp_stderr = open( tmp, 'wb' )
+                print "The cmd is %s" % cmd
+                proc = subprocess.Popen( args=cmd, shell=True, cwd=tmp_dir, stderr=tmp_stderr.fileno() )
+                returncode = proc.wait()
+                tmp_stderr.close()
+                # get stderr, allowing for case where it's very large
+                tmp_stderr = open( tmp, 'rb' )
+                stderr = ''
+                try:
+                    while True:
+                        stderr += tmp_stderr.read( buffsize )
+                        if not stderr or len( stderr ) % buffsize != 0:
+                            break
+                except OverflowError:
+                    pass
+                tmp_stderr.close()
+                if returncode != 0:
+                    raise Exception, stderr
+            except Exception, e:
+                raise Exception, 'Error generating alignments. ' + str( e )
+            # remove header if necessary
+            if options.suppressHeader == 'true':
+                tmp_out = tempfile.NamedTemporaryFile( dir=tmp_dir)
+                tmp_out_name = tmp_out.name
+                tmp_out.close()
+                try:
+                    shutil.move( options.output, tmp_out_name )
+                except Exception, e:
+                    raise Exception, 'Error moving output file before removing headers. ' + str( e )
+                fout = file( options.output, 'w' )
+                for line in file( tmp_out.name, 'r' ):
+                    if not ( line.startswith( '@HD' ) or line.startswith( '@SQ' ) or line.startswith( '@RG' ) or line.startswith( '@PG' ) or line.startswith( '@CO' ) ):
+                        fout.write( line )
+                fout.close()
+            # check that there are results in the output file
+            if os.path.getsize( options.output ) > 0:
+                sys.stdout.write( 'BWA run on %s-end data' % options.genAlignType )
+            else:
+                raise Exception, 'The output file is empty. You may simply have no matches, or there may be an error with your input file or settings.'
+        except Exception, e:
+            stop_err( 'The alignment failed.\n' + str( e ) )
+    finally:
+        # clean up temp dir
+        if os.path.exists( tmp_index_dir ):
+            shutil.rmtree( tmp_index_dir )
+        if os.path.exists( tmp_dir ):
+            shutil.rmtree( tmp_dir )
+
+if __name__ == "__main__":
+    __main__()
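
The read-group handling in bwa_mem.py can be illustrated in isolation. The following is a minimal Python 3 sketch, not the wrapper's own code: the function name `build_read_group` and its dictionary argument are hypothetical, and the check for required tags is an assumption based on the "Required if RG specified" help text in bwa_mem.xml. Only the tag order (ID, LB, PL, SM, then the optional tags) and the TAB separator follow the script.

```python
# Hypothetical helper, not part of bwa_mem.py: assembles an @RG header line
# in the same tag order the wrapper uses, joined by TAB characters.
REQUIRED_TAGS = ("ID", "LB", "PL", "SM")
OPTIONAL_TAGS = ("PU", "CN", "DS", "DT", "FO", "KS", "PG", "PI")


def build_read_group(tags):
    """Return an @RG line built from a dict mapping tag names to values."""
    # Assumption: refuse to build a header when a required tag is missing.
    missing = [t for t in REQUIRED_TAGS if not tags.get(t)]
    if missing:
        raise ValueError("missing required read group tags: " + ", ".join(missing))
    fields = ["%s:%s" % (t, tags[t]) for t in REQUIRED_TAGS]
    # Optional tags are appended only when they carry a non-empty value.
    fields += ["%s:%s" % (t, tags[t]) for t in OPTIONAL_TAGS if tags.get(t)]
    return "\t".join(["@RG"] + fields)


if __name__ == "__main__":
    example = {"ID": "rg1", "LB": "lib1", "PL": "ILLUMINA", "SM": "sample1", "PU": "unit1"}
    print(build_read_group(example))
```

The resulting string is what the wrapper would place after `-R` on the `bwa mem` command line; the sample values are illustrative only.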

diff -r 000000000000 -r 6820983ba5d5 bwa_mem.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/bwa_mem.xml Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,208 @@
+<tool id="bwa_mem" name="Map with BWA-MEM" version="0.7.7">
+  <requirements>
+    <requirement type="package" version="0.7.7">bwa</requirement>
+  </requirements>
+  <description></description>
+  <parallelism method="basic"></parallelism>
+  <version_command>bwa 2>&1 | grep "Version: " | sed -e 's/Version: //'</version_command>
+  <command interpreter="python">
+    bwa_mem.py
+      --threads="\${GALAXY_SLOTS:-1}"
+      --fileSource="${genomeSource.refGenomeSource}"
+      #if $genomeSource.refGenomeSource == "history":
+        ##build index on the fly
+        --ref="${genomeSource.ownFile}"
+        --dbkey="${dbkey}"
+      #else:
+        ##use precomputed indexes
+        --ref="${genomeSource.indices.fields.path}"
+      #end if
+
+      ## input file(s)
+      --fastq="${paired.fastq}"
+      #if $paired.sPaired == "paired":
+        --rfastq="${paired.rfastq}"
+      #end if
+
+      ## output file
+      --output="${output}"
+
+      ## run parameters
+      --genAlignType="${paired.sPaired}"
+      --params="${params.source_select}"
+      #if $params.source_select != "pre_set":
+        --minEditDistSeed="${params.minEditDistSeed}"
+        --bandWidth="${params.bandWidth}"
+        --offDiagonal="${params.offDiagonal}"
+        --internalSeeds="${params.internalSeeds}"
+        --seedsOccurrence="${params.seedsOccurrence}"
+        --mateRescue="${params.mateRescue}"
+        --skipPairing="${params.skipPairing}"
+        --seqMatch="${params.seqMatch}"
+        --mismatch="${params.mismatch}"
+        --gapOpen="${params.gapOpen}"
+        --gapExtension="${params.gapExtension}"
+        --clipping="${params.clipping}"
+        --unpairedReadpair="${params.unpairedReadpair}"
+        --interPairEnd="${params.interPairEnd}"
+        --minScore="${params.minScore}"
+        --mark="${params.mark}"
+
+        #if $params.readGroup.specReadGroup == "yes"
+          --rgid="${params.readGroup.rgid}"
+          --rgsm="${params.readGroup.rgsm}"
+          --rgpl="${params.readGroup.rgpl}"
+          --rglb="${params.readGroup.rglb}"
+          --rgpu="${params.readGroup.rgpu}"
+          --rgcn="${params.readGroup.rgcn}"
+          --rgds="${params.readGroup.rgds}"
+          --rgdt="${params.readGroup.rgdt}"
+          --rgfo="${params.readGroup.rgfo}"
+          --rgks="${params.readGroup.rgks}"
+          --rgpg="${params.readGroup.rgpg}"
+          --rgpi="${params.readGroup.rgpi}"
+        #end if
+      #end if
+
+      ## suppress output SAM header
+      --suppressHeader="${suppressHeader}"
+  </command>
+
+  <inputs>
+    <conditional name="genomeSource">
+      <param name="refGenomeSource" type="select" label="Will you select a reference genome from your history or use a built-in index?">
+        <option value="indexed">Use a built-in index</option>
+        <option value="history">Use one from the history</option>
+      </param>
+      <when value="indexed">
+        <param name="indices" type="select" label="Select a reference genome">
+          <options from_data_table="bwa_indexes">
+            <filter type="sort_by" column="2" />
+            <validator type="no_options" message="No indexes are available" />
+          </options>
+        </param>
+      </when>
+      <when value="history">
+        <param name="ownFile" type="data" format="fasta" metadata_name="dbkey" label="Select a reference from history" />
+      </when>
+    </conditional>
+    <conditional name="paired">
+      <param name="sPaired" type="select" label="Is this library mate-paired?">
+        <option value="single">Single-end</option>
+        <option value="paired">Paired-end</option>
+      </param>
+      <when value="single">
+        <param name="fastq" type="data" format="fastqsanger,fastqillumina" label="FASTQ file" help="FASTQ with either Sanger-scaled quality values (fastqsanger) or Illumina-scaled quality values (fastqillumina)" />
+      </when>
+      <when value="paired">
[...]
L)" help="Required if RG specified. Valid values : CAPILLARY, LS454, ILLUMINA,
+SOLID, HELICOS, IONTORRENT and PACBIO" />
+            <param name="rglb" type="text" size="25" label="[Essential]Library name (LB)" help="Required if RG specified" />
+            <param name="rgsm" type="text" size="25" label="[Essential]Sample (SM)" help="Required if RG specified. Use pool name where a pool is being sequenced" />
+            <param name="rgpu" type="text" size="25" label="Platform unit (PU)" help="Optional. Unique identifier (e.g. flowcell-barcode.lane for Illumina or slide for SOLiD)" />
+            <param name="rgcn" type="text" size="25" label="Sequencing center that produced the read (CN)" help="Optional" />
+            <param name="rgds" type="text" size="25" label="Description (DS)" help="Optional" />
+            <param name="rgdt" type="text" size="25" label="Date that run was produced (DT)" help="Optional. ISO8601 format date or date/time, like YYYY-MM-DD" />
+            <param name="rgfo" type="text" size="25" label="Flow order (FO). The array of nucleotide bases that correspond to the nucleotides used for each
+flow of each read." help="Optional. Multi-base flows are encoded in IUPAC format, and non-nucleotide flows by
+various other characters. Format : /\*|[ACMGRSVTWYHKDBN]+/" />
+            <param name="rgks" type="text" size="25" label="The array of nucleotide bases that correspond to the key sequence of each read (KS)" help="Optional" />
+            <param name="rgpg" type="text" size="25" label="Programs used for processing the read group (PG)" help="Optional" />
+            <param name="rgpi" type="text" size="25" label="Predicted median insert size (PI)" help="Optional" />
+          </when>
+          <when value="no" />
+        </conditional>
+      </when>
+    </conditional>
+    <param name="suppressHeader" type="boolean" truevalue="true" falsevalue="false" checked="False" label="Suppress the header in the output SAM file" help="BWA produces SAM with several lines of header information" />
+  </inputs>
+
+  <outputs>
+    <data format="sam" name="output" label="${tool.name} on ${on_string}: mapped reads">
+      <actions>
+        <conditional name="genomeSource.refGenomeSource">
+          <when value="indexed">
+            <action type="metadata" name="dbkey">
+              <option type="from_data_table" name="bwa_indexes" column="1">
+                <filter type="param_value" column="0" value="#" compare="startswith" keep="False"/>
+                <filter type="param_value" ref="genomeSource.indices" column="0"/>
+              </option>
+            </action>
+          </when>
+          <when value="history">
+            <action type="metadata" name="dbkey">
+              <option type="from_param" name="genomeSource.ownFile" param_attribute="dbkey" />
+            </action>
+          </when>
+        </conditional>
+      </actions>
+    </data>
+  </outputs>
+
+  <tests>
+    <test>
+    </test>
+    <test>
+    </test>
+    <test>
+    </test>
+  </tests>
+  <help>
+**What it does**
+
+BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. BWA-MEM, which is the latest algorithm, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
+
+------
+
+**Input formats**
+
+BWA accepts files in either Sanger FASTQ format (galaxy type *fastqsanger*) or Illumina FASTQ format (galaxy type *fastqillumina*). Use the FASTQ Groomer to prepare your files.
+
+------
+
+**License and citation**
+
+This tool uses `BWA`_, which is licensed separately. Please cite |Li2013|_.
+
+.. _BWA: http://bio-bwa.sourceforge.net/
+.. |Li2013| replace:: Li, H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio.GN]
+.. _Li2013: http://arxiv.org/abs/1303.3997
+  </help>
+</tool>
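
The `suppressHeader` parameter asks the wrapper to drop SAM header lines from the output. A minimal Python 3 sketch of that filtering step is shown here, assuming the same five header prefixes that bwa_mem.py tests for; the function name `strip_sam_header` and the sample records are hypothetical and serve only as illustration.

```python
# Hypothetical helper, not part of the wrapper: removes SAM header lines by
# prefix, mirroring the prefixes checked in bwa_mem.py.
HEADER_PREFIXES = ("@HD", "@SQ", "@RG", "@PG", "@CO")


def strip_sam_header(lines):
    """Return only the lines that do not start with a SAM header prefix."""
    # str.startswith accepts a tuple, so one call covers all five prefixes.
    return [line for line in lines if not line.startswith(HEADER_PREFIXES)]


if __name__ == "__main__":
    sample = [
        "@HD\tVN:1.4\tSO:unsorted\n",
        "@SQ\tSN:phiX\tLN:5386\n",
        "read1\t0\tphiX\t1\t60\t4M\t*\t0\t0\tGAGT\tIIII\n",
    ]
    for line in strip_sam_header(sample):
        print(line, end="")
```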

diff -r 000000000000 -r 6820983ba5d5 readme.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/readme.rst Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,33 @@
+BWA-MEM wrapper
+===============
+
+This is a wrapper for BWA_, a software package for mapping low-divergent sequences against a large reference genome. This wrapper supports only the latest algorithm, BWA-MEM, which is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
+
+.. _BWA: http://bio-bwa.sourceforge.net/
+
+If you need a wrapper for the old BWA-backtrack algorithm, you may install the http://toolshed.g2.bx.psu.edu/view/devteam/bwa_wrappers repository.
+
+Configuration
+-------------
+
+The bwa_mem tool may be configured to use more than one CPU core by selecting an appropriate destination for this tool in the Galaxy job_conf.xml file (see https://wiki.galaxyproject.org/Admin/Config/Jobs and https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster ).
+
+If you are using Galaxy release_2013.11.04 or later, this tool will automatically use the number of CPU cores allocated by the job runner according to the configuration of the destination selected for this tool.
+
+If instead you are using an older Galaxy release, you should also add a line
+
+ GALAXY_SLOTS=N; export GALAXY_SLOTS
+
+(where N is the number of CPU cores allocated by the job runner for this tool) to the file
+
+ <tool_dependencies_dir>/bwa/0.7.7/crs4/bwa_mem/<hash_string>/env.sh
+
+Version history
+---------------
+
+- Release 0: Initial release in the Tool Shed. This is a fork of the http://toolshed.g2.bx.psu.edu/view/yufei-luo/bwa_0_7_5 repository with the following changes: Remove .loc file, since only .loc.sample should be included. Fix bwa_index.loc.sample file to contain only comments. Add suppressHeader param as in bwa_wrappers. Use $GALAXY_SLOTS environment variable when available. Add <version_command> and <help>. Remove unused import. Fix spacing and typos. Use new recommended citation. Add tool_dependencies.xml. Rename to bwa_mem. Definitively remove colorspace support. Use optparse instead of argparse since Galaxy still supports Python 2.6.
+
+Development
+-----------
+
+Development is hosted at https://bitbucket.org/nsoranzo/bwa_mem . Contributions and bug reports are very welcome!
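
The readme's Configuration section relies on the `GALAXY_SLOTS` environment variable, which bwa_mem.xml reads with the shell default `${GALAXY_SLOTS:-1}`. A minimal Python 3 sketch of the same fallback follows; the function name `thread_count` is hypothetical, and the sketch assumes that an unset or empty variable should yield one thread, as the shell expansion does.

```python
import os


def thread_count(environ=None):
    """Return the thread count from GALAXY_SLOTS, defaulting to 1.

    Mirrors the shell expansion ${GALAXY_SLOTS:-1}: the default applies when
    the variable is unset or set to an empty string.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("GALAXY_SLOTS") or "1"
    return int(value)


if __name__ == "__main__":
    print(thread_count())
```

Passing a plain dictionary instead of `os.environ` makes the fallback easy to check without touching the real environment.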

diff -r 000000000000 -r 6820983ba5d5 tool-data/bwa_index.loc.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/bwa_index.loc.sample Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,38 @@
+#This is a sample file distributed with Galaxy that enables tools
+#to use a directory of BWA indexed sequences data files. You will need
+#to create these data files and then create a bwa_index.loc file
+#similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The bwa_index.loc
+#file has this format (longer white space characters are TAB characters):
+#
+#<unique_build_id>   <dbkey>   <display_name>   <file_path>
+#
+#So, for example, if you had phiX indexed stored in
+#/depot/data2/galaxy/phiX/base/,
+#then the bwa_index.loc entry would look like this:
+#
+#phiX174   phiX   phiX Pretty   /depot/data2/galaxy/phiX/base/phiX.fa
+#
+#and your /depot/data2/galaxy/phiX/base/ directory
+#would contain phiX.fa.* files:
+#
+#-rw-r--r--  1 james    universe 830134 2005-09-13 10:12 phiX.fa.amb
+#-rw-r--r--  1 james    universe 527388 2005-09-13 10:12 phiX.fa.ann
+#-rw-r--r--  1 james    universe 269808 2005-09-13 10:12 phiX.fa.bwt
+#...etc...
+#
+#Your bwa_index.loc file should include an entry per line for each
+#index set you have stored. The "file" in the path does not actually
+#exist, but it is the prefix for the actual index files.  For example:
+#
+#phiX174 phiX phiX174 /depot/data2/galaxy/phiX/base/phiX.fa
+#hg18canon hg18 hg18 Canonical /depot/data2/galaxy/hg18/base/hg18canon.fa
+#hg18full hg18 hg18 Full /depot/data2/galaxy/hg18/base/hg18full.fa
+#/orig/path/hg19.fa hg19 hg19 /depot/data2/galaxy/hg19/base/hg19.fa
+#...etc...
+#
+#Note that for backwards compatibility with workflows, the unique ID of
+#an entry must be the path that was in the original loc file, because that
+#is the value stored in the workflow for that parameter. That is why the
+#hg19 entry above looks odd. New genomes can be better-looking.
+#
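
The .loc format described in the comments above (four TAB-separated fields, with `#` starting a comment line) can be read with a few lines of Python 3. This is a minimal sketch rather than Galaxy's own loader: the function name `parse_loc` is hypothetical, and the field names are taken from the `<columns>` entry of tool_data_table_conf.xml.sample.

```python
# Field names as listed in tool_data_table_conf.xml.sample.
COLUMNS = ("value", "dbkey", "name", "path")


def parse_loc(text):
    """Parse the contents of a bwa_index.loc file into a list of dicts."""
    entries = []
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError("expected %d TAB-separated fields: %r" % (len(COLUMNS), line))
        entries.append(dict(zip(COLUMNS, fields)))
    return entries


if __name__ == "__main__":
    sample = "#comment\nphiX174\tphiX\tphiX Pretty\t/depot/data2/galaxy/phiX/base/phiX.fa\n"
    for entry in parse_loc(sample):
        print(entry["value"], entry["path"])
```

As the sample file notes, `path` is the prefix shared by the index files, not a file that exists on its own.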

diff -r 000000000000 -r 6820983ba5d5 tool_data_table_conf.xml.sample
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.sample Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,7 @@
+<tables>
+    
+    <table name="bwa_indexes" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="tool-data/bwa_index.loc" />
+    </table>
+</tables>
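
The column names declared in this data table give meaning to the numeric `column` attributes used in bwa_mem.xml. The Python 3 sketch below extracts them with the standard library; the function name `table_columns` is hypothetical, and reading the indices as zero-based is an assumption, although it is consistent with the tool's dbkey action using `column="1"`.

```python
import xml.etree.ElementTree as ET

# Contents of tool_data_table_conf.xml.sample as added in this changeset.
SAMPLE = """<tables>
    <table name="bwa_indexes" comment_char="#">
        <columns>value, dbkey, name, path</columns>
        <file path="tool-data/bwa_index.loc" />
    </table>
</tables>"""


def table_columns(xml_text, table_name):
    """Return the ordered list of column names for the named data table."""
    root = ET.fromstring(xml_text)
    for table in root.findall("table"):
        if table.get("name") == table_name:
            return [name.strip() for name in table.findtext("columns").split(",")]
    raise KeyError(table_name)


if __name__ == "__main__":
    for index, name in enumerate(table_columns(SAMPLE, "bwa_indexes")):
        print(index, name)
```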

diff -r 000000000000 -r 6820983ba5d5 tool_dependencies.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_dependencies.xml Tue Mar 18 07:49:22 2014 -0400

@@ -0,0 +1,6 @@
+<?xml version="1.0"?>
+<tool_dependency>
+    <package name="bwa" version="0.7.7">
+        <repository changeset_revision="def70e393020" name="package_bwa_0_7_7" owner="iuc" toolshed="http://toolshed.g2.bx.psu.edu" />
+    </package>
+</tool_dependency>