Changeset 0:4edac0183857 (2012-03-05)

Next changeset 1:560aaa69b532 (2012-03-05)

Commit message:
Initial commit from tarball version 1.17

added:
rsem/README
rsem/rsem-1.1.17.xml
rsem/rsem-wrapper-1.1.17.pl
rsem/rsem_indices.loc.example

diff -r 000000000000 -r 4edac0183857 rsem/README
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rsem/README Mon Mar 05 11:12:34 2012 -0500

@@ -0,0 +1,92 @@
+# RSEM Galaxy Wrapper #
+
+## Introduction ##
+
+RSEM (RNA-Seq by Expectation-Maximization) is a software package for the
+estimation of gene and isoform abundances from RNA-Seq data. A key feature of
+RSEM is its statistically-principled approach to the handling of RNA-Seq
+reads that map to multiple genes and/or isoforms. In addition, RSEM is
+well-suited to performing quantification with de novo transcriptome
+assemblies, as it does not require a reference genome.
+
+## Installation ##
+
+Follow the [Galaxy Tool Shed
+instructions](http://wiki.g2.bx.psu.edu/Tool_Shed) to add this wrapper from
+the tool shed to your Galaxy instance. Once the files are in the tools
+directory, you must install the RSEM references. This can be done as follows:
+
+1. Place the file called `rsem_indices.loc` into the directory
+   `~/galaxy-dist/tool-data`. This file tells the RSEM wrapper how to find the
+   reference(s). It is formatted according to Galaxy's documentation with the
+   following tab-delimited format:
+
+        unique_build_id    dbkey    display_name    file_base_path
+
+   For example,
+
+        human_refseq_NM human_refseq_NM human_refseq_NM /opt/galaxy/references/human/1.1.2/NM_refseq_ref
+
+2. Download a pre-built RSEM reference from the [RSEM website](http://deweylab.biostat.wisc.edu/rsem/).
+
+3. Place the reference files into the `file_base_path` listed in the
+   `rsem_indices.loc` file.
+
+If you would rather build your own reference files, follow the instructions
+below and then place the resulting reference files into the `file_base_path` listed
+in the `rsem_indices.loc` file.
+
+### Building a custom RSEM reference ###
+
+For instructions on how to build the RSEM reference files, first see the [RSEM
+documentation](http://deweylab.biostat.wisc.edu/rsem/README.html).
+
+#### Example ####
+
+Suppose we have mouse RNA-Seq data and want to use the UCSC mm9 version of the
+mouse genome. We have downloaded the UCSC Genes transcript annotations in GTF
+format (as mm9.gtf) using the Table Browser and the knownIsoforms.txt file for
+mm9 from the UCSC Downloads. We also have all chromosome files for mm9 in the
+directory `/data/mm9`. We want to put the generated reference files under
+`/opt/galaxy/references` with name `mouse_125`. We'll add poly(A) tails with
+length 125. Please note that GTF files generated from UCSC's Table Browser do
+not contain isoform-gene relationship information. For the UCSC Genes
+annotation, this information can be obtained from the knownIsoforms.txt file.
+Suppose also that we want to build Bowtie indices and that the Bowtie
+executables are found in `/sw/bowtie`.
+
+To build the reference files, first run the command:
+
+    rsem-prepare-reference --gtf mm9.gtf \
+                           --transcript-to-gene-map knownIsoforms.txt \
+                           --bowtie-path /sw/bowtie \
+                           /data/mm9/chr1.fa,/data/mm9/chr2.fa,...,/data/mm9/chrM.fa \
+                           /opt/galaxy/references/mouse_125
+
+To add this reference to your Galaxy installation, add the following line to
+the `rsem_indices.loc` file:
+
+    mouse_125 mouse_125 mouse_125 /opt/galaxy/references/mouse_125
+
+Then restart Galaxy. The `mouse_125` reference should now be listed in the
+RSEM wrapper.
+
+## References ##
+
+* [RSEM website (stand alone package)](http://deweylab.biostat.wisc.edu/rsem/)
+
+* B. Li and C. Dewey (2011) [RSEM: accurate transcript quantification from
+  RNA-Seq data with or without a reference
+  genome](http://www.biomedcentral.com/1471-2105/12/323).
+  BMC Bioinformatics 12:323.
+
+* B. Li, V. Ruotti, R. Stewart, J. Thomson, and C. Dewey (2010) [RNA-Seq gene
+  expression estimation with read mapping
+  uncertainty](http://bioinformatics.oxfordjournals.org/content/26/4/493.abstract).
+  Bioinformatics 26(4): 493-500.
+
+## Contact information ##
+* RSEM galaxy wrapper questions: ruotti@wisc.edu
+* RSEM stand alone package questions: bli@cs.wisc.edu
+* [RSEM announcements mailing list](http://groups.google.com/group/rsem-announce)
+* [RSEM users mailing list](http://groups.google.com/group/rsem-users)
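
In the README's `rsem-prepare-reference` example, the `...` in the chromosome list stands for the remaining per-chromosome FASTA files. One way to assemble that comma-separated argument is sketched below. The helper name `join_fasta_list` is hypothetical and not part of RSEM, only three of the mm9 files are listed for illustration, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of RSEM): join the given file paths with
# commas, the form the README passes to rsem-prepare-reference.
join_fasta_list() {
    list=""
    for f in "$@"; do
        if [ -z "$list" ]; then
            list="$f"
        else
            list="$list,$f"
        fi
    done
    printf '%s\n' "$list"
}

# Illustration with three of the mm9 chromosome files named in the README.
fasta_list=$(join_fasta_list /data/mm9/chr1.fa /data/mm9/chr2.fa /data/mm9/chrM.fa)

# Print the command instead of running it, so the sketch has no side effects.
echo rsem-prepare-reference --gtf mm9.gtf \
     --transcript-to-gene-map knownIsoforms.txt \
     --bowtie-path /sw/bowtie \
     "$fasta_list" /opt/galaxy/references/mouse_125
```

In practice the helper could be called with a shell glob such as `/data/mm9/chr*.fa`, assuming every file matching the pattern belongs in the reference.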

diff -r 000000000000 -r 4edac0183857 rsem/rsem-1.1.17.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rsem/rsem-1.1.17.xml Mon Mar 05 11:12:34 2012 -0500

b'@@ -0,0 +1,572 @@\n+<tool id="live-rsem-1.1.17" name="RSEM-1.17" version="xml.1.1.17">\n+ <description>RNA-Seq by Expectation-Maximization</description>\n+ <command interpreter="perl">\n+\n+## DEFAULT PARAMETERS\n+rsem-wrapper-1.1.17.pl --calc-ci $useci.ci --fragment-length-mean $fraglenmean --fragment-length-min \n+$fraglenmin --fragment-length-sd $fraglensd --fragment-length-max $fraglenmax --bowtie-e \n+$bowtie_e --bowtie-m $bowtie_m\n+\n+ #if $input.format=="fastq"\n+ ## IF FASTQ AND SINGLE END READS (DEFAULTS)\n+\t#if $input.fastqmatepair.matepair=="single" #rsem-wrapper-1.1.17.pl --bam_genome $bam_genome --bamtype $bamtype \n+\t--seed-length $seedlength $input.fastq_select --estimate-rspd $rspd --forward-prob \n+\t$fprob -p $cpus --bowtie-n $bowtie_mis --output-genome-bam --single_fastq $singlefastq \n+\t--output $output --isoformfile $isoforms --bamfile $bam_res --log $log \n+\t--sampling-for-bam $sampling_for_bam --reference ${index.fields.path}\n+ \t#end if\n+ ## IF FASTQ AND PAIRED END READS (DEFAULTS)\n+ #if $input.fastqmatepair.matepair=="paired" #rsem-wrapper-1.1.17.pl --bam_genome $bam_genome --bamtype $bamtype \n+\t--paired-end --seed-length $seedlength --estimate-rspd $rspd $input.fastq_select --forward-prob $fprob -p $cpus \n+ --bowtie-n $bowtie_mis --output-genome-bam --fastq1 $fastq1 --fastq2 $fastq2 --output \n+ $output --isoformfile $isoforms --bamfile $bam_res --log $log --sampling-for-bam \n+ $sampling_for_bam --reference ${index.fields.path} \n+\t#end if\n+ #end if \n+ #if $input.format=="fasta"\n+ ## IF FASTA AND SINGLE END READS (DEFAULTS)\n+\t#if $input.fastamatepair.matepair=="single" #rsem-wrapper-1.1.17.pl --bam_genome $bam_genome --bamtype $bamtype\n+\t--no-qualities --seed-length $seedlength --estimate-rspd $rspd --forward-prob $fprob -p $cpus --bowtie-n $bowtie_mis \n+\t--output-genome-bam --single_fasta $single_fasta --output $output --isoformfile \n+\t$isoforms --bamfile $bam_res --log $log --sampling-for-bam $sampling_for_bam 
--reference \n+\t${index.fields.path}\n+ #end if\n+ ## IF FASTA AND PAIRED END READS (DEFAULTS)\n+ #if $input.fastamatepair.matepair=="paired" #rsem-wrapper-1.1.17.pl --bam_genome $bam_genome --bamtype $bamtype\n+\t--no-qualities --paired-end --seed-length $seedlength --estimate-rspd $rspd --forward-prob $fprob -p $cpus \n+ --bowtie-n $bowtie_mis --output-genome-bam --fasta1 $fasta1 --fasta2 $fasta2 --output \n+ $output --isoformfile $isoforms --bamfile $bam_res --log $log --sampling-for-bam \n+ $sampling_for_bam --reference ${index.fields.path} \n+\t#end if\n+ #end if\n+\n+ </command>\n+\n+ <inputs>\n+ <param name="sample" type="text" format="txt" label="Sample label" />\n+ <conditional name="input">\n+\t<param name="format" type="select" label="Input file type">\n+\t\t<option value="fastq">FASTQ</option>\n+\t\t<option value="fasta">FASTA</option>\n+\t</param>\n+ \t<when value="fastq">\n+ <param name="fastq_select" size="15" type="select" label="FASTQ type" >\n+\t\t\t\t\t\t<option value="--phred33-quals">phred33 qualities</option>\n+\t\t\t\t\t\t<option value="--solexa-quals">solexa qualities</option>\n+\t\t\t\t\t\t<option value="--phred64-quals">phred64 qualities</option>\n+\t\t </param>\n+\n+\t<conditional name="fastqmatepair">\n+ \t<when value="single">\n+ \t<param name="singlefastq" type="data" checked="yes" format="fastq" label="FASTQ file" />\n+\t\t\t</when>\n+ \t<when value="paired">\n+ \t<param name="fastq1" type="data" format="fastq" label="Read 1 fastq file" />\n+ <param name="fastq2" type="data" format="fastq" label="Read 2 fastq file" />\n+ \t\t\t</when>\n+ \t\t<param name="matepair" type="select" label="Library type">\n+\t\t\t<option value="single">Single End Reads</option>\n+\t\t\t<option value="paired">Paired End Reads</option>\n+\t\t</param>\n+ \t</conditional>\n+\t</when>\n+ \t<when value="fasta">\n+\t<conditional name="fastamatepair">\n+ \t<param name="matepair" type="select" label="Library Type">\n+\t\t\t<option v'..b"f a read. 
In addition, RSEM pads a\n+ new tag ZW:f:value, where value is a single precision floating\n+ number representing the posterior probability. If an alignment is\n+ spliced, a XS:A:value tag is also added, where value is either '+'\n+ or '-' indicating the strand of the transcript it aligns to.\n+\n+ 'sample_name.genome.sorted.bam' and\n+ 'sample_name.genome.sorted.bam.bai' are the sorted BAM file and\n+ indices generated by samtools (included in RSEM package).\n+\n+ sample_name.sam.gz\n+ Only generated when the input files are raw reads instead of SAM/BAM\n+ format files\n+\n+ It is the gzipped SAM output produced by bowtie aligner.\n+\n+ sample_name.time\n+ Only generated when --time is specified.\n+\n+ It contains time (in seconds) consumed by aligning reads, estimating\n+ expression levels and calculating credibility intervals.\n+\n+ sample_name.stat\n+ This is a folder instead of a file. All model related statistics are\n+ stored in this folder. Use 'rsem-plot-model' can generate plots\n+ using this folder.\n+\n+EXAMPLES\n+ Assume the path to the bowtie executables is in the user's PATH\n+ environment variable. Reference files are under '/ref' with name\n+ 'mouse_125'.\n+\n+ 1) '/data/mmliver.fq', single-end reads with quality scores. Quality\n+ scores are encoded as for 'GA pipeline version >= 1.3'. We want to use 8\n+ threads and generate a genome BAM file:\n+\n+ rsem-calculate-expression --phred64-quals \\\n+ -p 8 \\\n+ --output-genome-bam \\\n+ /data/mmliver.fq \\\n+ /ref/mouse_125 \\\n+ mmliver_single_quals\n+\n+ 2) '/data/mmliver_1.fq' and '/data/mmliver_2.fq', paired-end reads with\n+ quality scores. Quality scores are in SANGER format. We want to use 8\n+ threads and do not generate a genome BAM file:\n+\n+ rsem-calculate-expression -p 8 \\\n+ --paired-end \\\n+ /data/mmliver_1.fq \\\n+ /data/mmliver_2.fq \\\n+ /ref/mouse_125 \\\n+ mmliver_paired_end_quals\n+\n+ 3) '/data/mmliver.fa', single-end reads without quality scores. 
We want\n+ to use 8 threads:\n+\n+ rsem-calculate-expression -p 8 \\\n+ --no-qualities \\\n+ /data/mmliver.fa \\\n+ /ref/mouse_125 \\\n+ mmliver_single_without_quals\n+\n+ 4) Data are the same as 1). We want to take a fragment length\n+ distribution into consideration. We set the fragment length mean to 150\n+ and the standard deviation to 35. In addition to a BAM file, we also\n+ want to generate credibility intervals. We allow RSEM to use 1GB of\n+ memory for CI calculation:\n+\n+ rsem-calculate-expression --bowtie-path /sw/bowtie \\\n+ --phred64-quals \\\n+ --fragment-length-mean 150.0 \\\n+ --fragment-length-sd 35.0 \\\n+ -p 8 \\\n+ --output-genome-bam \\\n+ --calc-ci \\\n+ --ci-memory 1024 \\\n+ /data/mmliver.fq \\\n+ /ref/mouse_125 \\\n+ mmliver_single_quals\n+\n+ 5) '/data/mmliver_paired_end_quals.bam', paired-end reads with quality\n+ scores. We want to use 8 threads:\n+\n+ rsem-calculate-expression --paired-end \\\n+ --bam \\\n+ -p 8 \\\n+ /data/mmliver_paired_end_quals.bam \\\n+ /ref/mouse_125 \\\n+ mmliver_paired_end_quals\n+ </help> \n+</tool> \n"

diff -r 000000000000 -r 4edac0183857 rsem/rsem-wrapper-1.1.17.pl
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rsem/rsem-wrapper-1.1.17.pl Mon Mar 05 11:12:34 2012 -0500

b'@@ -0,0 +1,238 @@\n+#!/usr/bin/perl \n+\n+\n+use Data::Dumper;\n+use Getopt::Long;\n+use Pod::Usage;\n+\n+\n+\n+#pod2usage(-verbose => 1) if ($help == 1);\n+#if (@ARGV == 0) {\n+#\tpod2usage(-msg => "Invalid number of arguments!", -exitval => 2, -verbose => 2);\n+#}\n+\n+my $rsem_version = "/opt/rsem-1.1.17";\n+my $minL = 1;\n+my $maxL = 1000;\n+my $NMB = 1024;\n+\n+# Extra file output #beta\n+# --isoformfile $isoforms \n+# --thetafile $theta \n+# --cntfile $cnt \n+# --modelfile $model \n+# --bamfile $bam_res\n+\n+GetOptions(\n+ "log=s" => \\$log,\n+ "bam_genome=s" => \\$bam_genome,\n+ "bamtype=s" => \\$bamtype,\n+ "isoformfile=s" => \\$isoforms,\n+ "reference=s" => \\$dbref,\n+ "sampling-for-bam=s" => \\$samplingbam,\n+ "thetafile=s" => \\$theta,\n+ "cntfile=s" => \\$cnt,\n+ "modelfile=s" => \\$model,\n+ "bamfile=s" => \\$bamfile,\n+ "output=s" => \\$output,\n+ "single_fasta=s" => \\$single_fasta,\n+ "fasta1=s" => \\$fasta1,\n+ "fasta2=s" => \\$fasta2,\n+ "single_fastq=s" => \\$single_fastq,\n+ "fastq1=s" => \\$fastq1,\n+ "fastq2=s" => \\$fastq2,\n+ "no-qualities" => \\$no_qual,\n+ "paired-end" => \\$paired_end,\n+ "sam" => \\$is_sam,\n+ "bam" => \\$is_bam,\n+ "sam-header-info=s" => \\$fn_list,\n+ "tag=s" => \\$tagName,\n+ "seed-length=i" => \\$L,\n+ "bowtie-path=s" => \\$bowtie_path,\n+ "bowtie-n=i" => \\$C,\n+ "bowtie-e=i" => \\$E,\n+ "bowtie-m=i" => \\$maxHits,\n+ "phred33-quals" => \\$phred33,\n+ "phred64-quals" => \\$phred64,\n+ "solexa-quals" => \\$solexa,\n+ "forward-prob=f" => \\$probF,\n+ "fragment-length-min=i" => \\$minL,\n+ "fragment-length-max=i" => \\$maxL,\n+ "fragment-length-mean=f" => \\$mean,\n+ "fragment-length-sd=f" => \\$sd,\n+ "estimate-rspd=s" => \\$estRSPD,\n+ "num-rspd-bins=i" => \\$B,\n+ "p|num-threads=i" => \\$nThreads,\n+ "output-genome-bam" => \\$genBamF,\n+ "calc-ci=s" => \\$calcCI,\n+ "ci-memory=i" => \\$NMB,\n+ "time" => \\$mTime,\n+ "q|quiet" => \\$quiet,\n+) or pod2usage( -exitval => 2, -verbose => 2 );\n+\n+#check parameters 
and options\n+\n+if ($is_sam || $is_bam) {\n+ pod2usage(-msg => "from rsem-wrapper->Invalid number of arguments!", -exitval => 2, -verbose => 2) if (scalar(@ARGV) != 4);\n+ pod2usage(-msg => "--sam and --bam cannot be active at the same time!", -exitval => 2, -verbose => 2) if ($is_sam == 1&& $is_bam == 1);\n+ pod2usage(-msg => "--bowtie-path, --bowtie-n, --bowtie-e, --bowtie-m, --phred33-quals, --phred64-quals or --solexa-quals cannot be set if input is SAM/BAM format!", -exitval => 2, -verbose => 2) if ($bowtie_path ne "" || $C != 2 || $E != 99999999 || $maxHits != 200 || $phred33 || $phred64 || $solexa);\n+}\n+#else {\n+# pod2usage(-msg => "from rsem-wraper->Invalid number of arguments!", -exitval => 2, -verbose => 2) \n+#\tif (!$paired_end && scalar(@ARGV) != 1 || $paired_end && scalar(@ARGV) != 1); \n+# pod2usage(-msg => "Only one of --phred33-quals --phred64-quals/--solexa1.3-quals --solexa-suqls can be active!", -exitval => 2, -verbose => 2) if ($phred33 + $phred64 + $solexa > 1); \n+# podwusage(-msg => "--sam , --bam or --sam-header-info cannot be set if use bowtie aligner to produce alignments!", -exitval => 2, -verbose => 2) if ($is_sam || $is_bam || $fn_list ne "");\n+#}\n+\n+pod2usage(-msg => "Forward probability should be in [0, 1]!", -exitval => 2, -verbose => 2) if ($probF < 0 || $probF > 1);\n+pod2usage(-msg => "Min fragment length should be at least 1!", -exitval => 2, -verbose => 2) if ($minL < 1);\n+pod2usage(-msg => "Min fragment length should be smaller or equal to max fragment len'..b'*ERROR, \'w\' ) or die "cant open file $!\\n";\n+# \n+\n+my @options;\n+\n+# generates new output called sample_name.genome.bam \n+# with alignments\n+# mapped to genomic coordinates and annotated with their posterior\n+# probabilities. In addition, RSEM will call samtools (included in\n+# RSEM package) to sort and index the bam file.\n+# \'sample_name.genome.sorted.bam\' and\n+# \'sample_name.genome.sorted.bam.bai\' will be generated. 
(Default: off)\n+\n+if ($bamtype eq "yes") {\n+ my $bam_genome_par = "--output-genome-bam";\n+ push @options, $bam_genome_par;\n+}\n+if ($samplingbam eq "yes") {\n+ my $samplingbam = "--sampling-for-bam";\n+ push @options, $samplingbam;\n+}\n+if ($estRSPD eq "yes") {\n+ my $rspd = "--estimate-rspd";\n+ push @options, $rspd;\n+}\n+$probF = "--forward-prob $probF";\n+push @options, $probF;\n+\n+if ($calcCI eq "yes") {\n+ my $calcCI = "--calc-ci";\n+ push @options, $calcCI;\n+ my $cimem = "--ci-memory $NMB";\n+ push @options, $cimem;\n+}\n+if ($tagName) {\n+ my $tagName = "--tag $tagName";\n+ push @options, $tagName;\n+}\n+if ($L) {\n+\tmy $L = "--seed-length $L";\n+\tpush @options, $L;\n+}\n+if ($C) {\n+ my $C = "--bowtie-n $C";\n+ push @options, $C;\n+}\n+if ($E) {\n+ my $E = "--bowtie-e $E";\n+ push @options, $E;\n+}\n+if ($maxHits) {\n+ my $maxHits = "--bowtie-m $maxHits";\n+ push @options, $maxHits;\n+}\n+if ($minL != 1) {\n+ my $minL = "--fragment-length-min $minL";\n+ push @options, $minL;\n+}\n+if ($maxL != 1000) {\n+ my $maxL = "--fragment-length-max $maxL";\n+ push @options, $maxL;\n+}\n+if ($mean) {\n+ my $mean = "--fragment-length-mean $mean";\n+ push @options, $mean;\n+}\n+if ($sd) {\n+ my $sd = "--fragment-length-sd $sd";\n+ push @options, $sd;\n+}\n+my $options= join(" ", @options);\n+\n+#BUILD COMMAND BASED ON PARSED OPTIONS\n+if ($no_qual) { \n+ #reads are in fasta file format\n+\tif ($paired_end) { # reads are in paired end\n+\t my $cmd = "$rsem_version/rsem-calculate-expression --quiet --no-qualities --paired-end -p $nThreads $options $fasta1 $fasta2 $dbref $output";\n+ \t print "RSEM Parameters used by Galaxy:\\n$cmd\\n";\n+\t system($cmd);\n+\t}\n+\t#run single end with one fasta file\n+\telse {\n+\tmy $cmd = "$rsem_version/rsem-calculate-expression --quiet --no-qualities -p $nThreads $options $single_fasta $dbref $output";\n+ \tprint "RSEM Parameters used by Galaxy:\\n$cmd\\n";\n+\tsystem($cmd);\n+\t}\n+}\n+else {\n+ # reads are in fastq file 
format\n+ # type of fastq file?\n+\tmy $fastqtype;\n+\tif ($phred33) {\n+\t\t$fastqtype = "--phred33-quals";\n+\t}\n+\telsif ($phred64) {\n+\t\t$fastqtype = "--phred64-quals";\n+\t}\n+\telsif ($solexa) {\n+\t\t$fastqtype = "--solexa-quals";\n+\t}\n+\tif ($paired_end) { \n+\t#reads in paired end \n+ #run paired end with two fasq files\n+\t my $cmd = "$rsem_version/rsem-calculate-expression --quiet --paired-end -p $nThreads $options $fastqtype $fastq1 $fastq2 $dbref $output";\n+ \t print "RSEM Parameters used by Galaxy:\\n$cmd\\n";\n+\t system($cmd);\n+\t}\n+\telse { \n+\tmy $cmd = "$rsem_version/rsem-calculate-expression --quiet -p $nThreads $options $fastqtype $single_fastq $dbref $output";\n+ \tprint "RSEM Parameters used by Galaxy:\\n$cmd\\n";\n+\tsystem($cmd);\n+\t}\n+}\n+\n+ #Rename files for galaxy\n+my $mv_genes = "mv $output.genes.results $output";\n+my $mv_isoforms = "mv $output.isoforms.results $isoforms";\n+\n+#print "bamtype-parameter=$bamtype\\n";\n+my $mv_bam_transcript;\n+my $mv_bam_genome;\n+if ($bamtype eq "yes") {\n+ $mv_bam_genome = "mv $output.genome.sorted.bam $bam_genome";\n+ system($mv_bam_genome);\n+}\n+\n+$mv_bam_transcript = "mv $output.transcript.sorted.bam $bamfile";\n+\n+my @rsem_dir = split(/\\//, $output);\n+my $short_output = $rsem_dir[-1];\n+my $mv_theta = "mv $output.stat/$short_output.theta $theta";\n+my $mv_cnt = "mv $output.stat/$short_output.cnt $cnt";\n+my $mv_model = "mv $output.stat/$short_output.model $model";\n+system($mv_genes);\n+system($mv_isoforms);\n+system($mv_bam_transcript);\n+#system($mv_theta);\n+#system($mv_cnt);\n+#system($mv_model);\n+#print "LOG $mv\\n";\n'
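
The last step of the wrapper above renames RSEM's `.genes.results` and `.isoforms.results` files with `mv` commands whose exit status is not checked. The following is a minimal shell sketch of the same renaming step with explicit failure reporting. It is not the wrapper's code, and the function names `move_or_fail` and `rename_rsem_outputs` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: move one file, reporting an error if it is missing.
move_or_fail() {
    if [ ! -f "$1" ]; then
        echo "missing expected RSEM output: $1" >&2
        return 1
    fi
    mv -- "$1" "$2"
}

# Hypothetical helper: rename the gene and isoform result files written
# under an RSEM output prefix, stopping at the first failure.
#   $1 = RSEM output prefix (the sample name given to RSEM)
#   $2 = destination for <prefix>.genes.results
#   $3 = destination for <prefix>.isoforms.results
rename_rsem_outputs() {
    move_or_fail "$1.genes.results" "$2" &&
    move_or_fail "$1.isoforms.results" "$3"
}
```

A caller could then stop the job when `rename_rsem_outputs` returns a non-zero status, rather than continuing with missing result files.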

diff -r 000000000000 -r 4edac0183857 rsem/rsem_indices.loc.example
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/rsem/rsem_indices.loc.example Mon Mar 05 11:12:34 2012 -0500

@@ -0,0 +1,39 @@
+#This is a sample file distributed with Galaxy that enables tools
+#to use a directory of Bowtie indexed sequences data files. You will
+#need to create these data files and then create a bowtie_indices.loc
+#file similar to this one (store it in this directory) that points to
+#the directories in which those files are stored. The bowtie_indices.loc
+#file has this format (longer white space characters are TAB characters):
+#
+#<unique_build_id>   <dbkey>   <display_name>   <file_base_path>
+#
+#So, for example, if you had hg18 indexed stored in
+#/depot/data2/galaxy/bowtie/hg18/,
+#then the bowtie_indices.loc entry would look like this:
+#
+#hg18   hg18   hg18   /depot/data2/galaxy/bowtie/hg18/hg18
+human_refseq_NM human_refseq_NM human_refseq_NM /opt/galaxy/references/human/1.1.2/NM_refseq_ref
+
+#
+#and your /depot/data2/galaxy/bowtie/hg18/ directory
+#would contain hg18.*.ebwt files:
+#
+#-rw-r--r--  1 james    universe 830134 2005-09-13 10:12 hg18.1.ebwt
+#-rw-r--r--  1 james    universe 527388 2005-09-13 10:12 hg18.2.ebwt
+#-rw-r--r--  1 james    universe 269808 2005-09-13 10:12 hg18.3.ebwt
+#...etc...
+#
+#Your bowtie_indices.loc file should include an entry per line for each
+#index set you have stored. The "file" in the path does not actually
+#exist, but it is the prefix for the actual index files. For example:
+#
+#hg18canon          hg18   hg18 Canonical   /depot/data2/galaxy/bowtie/hg18/hg18canon
+#hg18full           hg18   hg18 Full        /depot/data2/galaxy/bowtie/hg18/hg18full
+#/orig/path/hg19    hg19   hg19             /depot/data2/galaxy/bowtie/hg19/hg19
+#...etc...
+#
+#Note that for backwards compatibility with workflows, the unique ID of
+#an entry must be the path that was in the original loc file, because that
+#is the value stored in the workflow for that parameter. That is why the
+#hg19 entry above looks odd. New genomes can be better-looking.
+#
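
Because each entry in the `.loc` file must consist of four TAB-separated fields (`unique_build_id`, `dbkey`, `display_name`, `file_base_path`), a quick check before restarting Galaxy can catch entries that were typed with spaces. The sketch below assumes only that format. The function name `check_loc_file` is hypothetical and the check is not part of Galaxy or RSEM.

```shell
#!/bin/sh
# Hypothetical checker: report lines of a .loc file that are neither
# comments, blank, nor made of exactly four tab-separated fields.
# Returns non-zero if any such line is found.
check_loc_file() {
    awk -F '\t' '
        /^#/ || /^[ \t]*$/ { next }   # skip comment lines and blank lines
        NF != 4 {
            printf "line %d: expected 4 tab-separated fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

For example, `check_loc_file ~/galaxy-dist/tool-data/rsem_indices.loc` would print the number of each offending line, using the tool-data location named in the README.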