# HG changeset patch
# User devteam
# Date 1434663340 14400
# Node ID ac30bfd3e2a8ed2163ff605f1c67a74740bd90c7
# Parent 607ca4b958372c7d20293e23e5e7b4055a556a5f
planemo upload commit a50a3947aebc8a1d11bac39599f4efd8ed9a3bd5
diff -r 607ca4b95837 -r ac30bfd3e2a8 README.md
--- a/README.md Fri Mar 20 12:21:16 2015 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,4 +0,0 @@
-bwa-mem
-=======
-
-A collection of Galaxy wrapper for bwa mem, aln, samse, sampe, pemerge, and bwasw
diff -r 607ca4b95837 -r ac30bfd3e2a8 README.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst Thu Jun 18 17:35:40 2015 -0400
@@ -0,0 +1,4 @@
+bwa-mem
+=======
+
+A collection of Galaxy wrapper for bwa mem, aln, samse, sampe, pemerge, and bwasw
diff -r 607ca4b95837 -r ac30bfd3e2a8 bwa-mem.xml
--- a/bwa-mem.xml Fri Mar 20 12:21:16 2015 -0400
+++ b/bwa-mem.xml Thu Jun 18 17:35:40 2015 -0400
@@ -1,5 +1,5 @@
-
+
 - map medium and long reads (> 100 bp) against reference genome
 bwa_macros.xml
@@ -135,9 +135,9 @@
-
-
-
+
+
+
@@ -162,7 +162,7 @@
-
+
@@ -173,7 +173,7 @@
-
+
@@ -181,7 +181,7 @@
-
+
@@ -193,9 +193,9 @@
-
-
-
+
+
+
@@ -302,6 +302,8 @@
+
+
@@ -322,6 +324,20 @@

 -----

+**Indices: Selecting reference genomes for BWA**
+
+Galaxy wrapper for BWA allows you select between precomputed and user-defined indices for reference genomes using **Will you select a reference genome from your history or use a built-in index?** flag. This flag has two options:
+
+ 1. **Use a built-in genome index** - when selected (this is default), Galaxy provides the user with **Select reference genome index** dropdown. Genomes listed in this dropdown have been pre-indexed with bwa index utility and are ready to be mapped against.
+ 2. **Use a genome from the history and build index** - when selected, Galaxy provides the user with **Select reference genome sequence** dropdown. This dropdown is populated by all FASTA formatted files listed in your current history. If your genome of interest is uploaded into history it will be shown there. Selecting a genome from this dropdown will cause Galaxy to first transparently index it using `bwa index` command, and then run mapping with `bwa mem`.
+
+If your genome of interest is not listed here you have two choices:
+
+ 1. Contact galaxy team using **Help->Support** link at the top of the interface and let us know that an index needs to be added
+ 2. Upload your genome of interest as a FASTA file to Galaxy history and selected **Use a genome from the history and build index** option.
+
+-----
+
 **Galaxy-specific option**

 Galaxy allows four levels of control over bwa-mem options provided by **Select analysis mode** menu option. These are:
diff -r 607ca4b95837 -r ac30bfd3e2a8 bwa.xml
--- a/bwa.xml Fri Mar 20 12:21:16 2015 -0400
+++ b/bwa.xml Thu Jun 18 17:35:40 2015 -0400
@@ -1,5 +1,5 @@
-
+
 - map short reads (< 100 bp) against reference genome
 bwa_macros.xml
@@ -24,7 +24,7 @@
 #end if
 #if str( $analysis_type.L ):
- -B ${analysis_type.L}
+ -L ${analysis_type.L}
 #end if
 #end if
@@ -250,11 +250,11 @@
-
+
-
-
-
+
+
+
@@ -387,6 +387,7 @@
+
@@ -394,18 +395,32 @@

 **What is does**

-BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. The bwa-aln algorithm is designed for Illumina sequence reads up to 100bp. For longer reads use BWA-MEM algorithm distributed as separate Galaxy tool.
+BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. The bwa-aln algorithm is designed for Illumina sequence reads up to 100bp. For longer reads use BWA-MEM algorithm distributed as a separate Galaxy tool.

 This Galaxy tool wraps bwa-aln, bwa-samse and -sampe modules of bwa read mapping tool:

- - bwa aln - actual mapper placing reads onto the reference sequence
- - bwa samse - post-processor converting suffix array coordinates into genome coordinates in SAM format for single reads
- - bam sampe - post-processor for paired reads
+ - **bwa aln** - actual mapper placing reads onto the reference sequence
+ - **bwa samse** - post-processor converting suffix array coordinates into genome coordinates in SAM format for single reads
+ - **bam sampe** - post-processor for paired reads

 Galaxy implementation takes fastq or BAM (unaligned BAM) datasets as input and produces output in BAM (not SAM; in reality SAM produced by the bwa is converted to BAM on the fly by samtools view command) format, which can be further processed using various BAM utilities exiting in Galaxy (BAMTools, SAMTools, Picard).

 -----

+**Indices: Selecting reference genomes for BWA**
+
+Galaxy wrapper for BWA allows you select between precomputed and user-defined indices for reference genomes using **Will you select a reference genome from your history or use a built-in index?** flag. This flag has two options:
+
+ 1. **Use a built-in genome index** - when selected (this is default), Galaxy provides the user with **Select reference genome index** dropdown. Genomes listed in this dropdown have been pre-indexed with bwa index utility and are ready to be mapped against.
+ 2. **Use a genome from the history and build index** - when selected, Galaxy provides the user with **Select reference genome sequence** dropdown. This dropdown is populated by all FASTA formatted files listed in your current history. If your genome of interest is uploaded into history it will be shown there. Selecting a genome from this dropdown will cause Galaxy to first transparently index it using `bwa index` command, and then run mapping with `bwa aln`.
+
+If your genome of interest is not listed here you have two choices:
+
+ 1. Contact galaxy team using **Help->Support** link at the top of the interface and let us know that an index needs to be added
+ 2. Upload your genome of interest as a FASTA file to Galaxy history and selected **Use a genome from the history and build index** option.
+
+-----
+
 **Galaxy-specific option**

 Galaxy allows three levels of control over bwa-mem options provided by **Select analysis mode** menu option. These are:
diff -r 607ca4b95837 -r ac30bfd3e2a8 bwa_macros.xml
--- a/bwa_macros.xml Fri Mar 20 12:21:16 2015 -0400
+++ b/bwa_macros.xml Thu Jun 18 17:35:40 2015 -0400
@@ -3,31 +3,31 @@
 #set $rg_string = "@RG\tID:" + str($rg.ID) + "\tSM:" + str($rg.SM) + "\tPL:" + str($rg.PL)
 #if $rg.LB
- #set $rg_string += "\tLB:$rg.LB"
+ #set $rg_string += "\tLB:" + str($rg.LB)
 #end if
 #if $rg.CN
- #set $rg_string += "\tCN:$rg.CN"
+ #set $rg_string += "\tCN:" + str($rg.CN)
 #end if
 #if $rg.DS
- #set $rg_string += "\tDS:$rg.DS"
+ #set $rg_string += "\tDS:" + str($rg.DS)
 #end if
 #if $rg.DT
- #set $rg_string += "\tDT:$rg.DT"
+ #set $rg_string += "\tDT:" + str($rg.DT)
 #end if
 #if $rg.FO
- #set $rg_string += "\tFO:$rg.FO"
+ #set $rg_string += "\tFO:" + str($rg.FO)
 #end if
 #if $rg.KS
- #set $rg_string += "\tKS:$rg.KS"
+ #set $rg_string += "\tKS:" + str($rg.KS)
 #end if
 #if $rg.PG
- #set $rg_string += "\tPG:$rg.PG"
+ #set $rg_string += "\tPG:" + str($rg.PG)
 #end if
 #if str($rg.PI)
- #set $rg_string += "\tPI:$rg.PI"
+ #set $rg_string += "\tPI:" + str($rg.PI)
 #end if
 #if $rg.PU
- #set $rg_string += "\tPU:$rg.PU"
+ #set $rg_string += "\tPU:" + str($rg.PU)
 #end if
@@ -38,7 +38,7 @@
 **Read Groups are Important!**
-One of the recommended best practices in NGS analysis is adding read group information to BAM files. You can do thid directly in BWA interface using the
+One of the recommended best practices in NGS analysis is adding read group information to BAM files. You can do this directly in BWA interface using the
 **Specify read group information?** widget. If you are not familiar with read groups you shold know that this is effectively a way to tag reads with an additional ID. This allows you to combine BAM files from, for example, multiple BWA runs into a single dataset. This significantly simplifies downstream processing as instead of dealing with multiple datasets you only have to handle only one. This is possible because the read group information allows you to identify
@@ -104,13 +104,13 @@
 **Dataset collections - processing large numbers of datasets at once**
-This will be added shortly
+Dataset collections are in beta-testing. Extensive documentation will be added later this Spring.
-
+
@@ -122,7 +122,7 @@
-
+
diff -r 607ca4b95837 -r ac30bfd3e2a8 shed_upload.tar.gz
Binary file shed_upload.tar.gz has changed
diff -r 607ca4b95837 -r ac30bfd3e2a8 test-data/bwa-aln-test1.bam
Binary file test-data/bwa-aln-test1.bam has changed
diff -r 607ca4b95837 -r ac30bfd3e2a8 test-data/bwa-aln-test2.bam
Binary file test-data/bwa-aln-test2.bam has changed
diff -r 607ca4b95837 -r ac30bfd3e2a8 test-data/bwa-aln-test3.bam
Binary file test-data/bwa-aln-test3.bam has changed
diff -r 607ca4b95837 -r ac30bfd3e2a8 test-data/bwa-mem-test1.bam
Binary file test-data/bwa-mem-test1.bam has changed
diff -r 607ca4b95837 -r ac30bfd3e2a8 test-data/bwa-mem-test2.bam
Binary file test-data/bwa-mem-test2.bam has changed
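
Reviewer note on the bwa_macros.xml hunk: the change replaces in-string placeholders such as "\tLB:$rg.LB" with explicit concatenation ("\tLB:" + str($rg.LB)) for every optional read group tag. The Python sketch below mirrors that assembly logic so it can be checked outside Galaxy. It is not part of the changeset: the function name `build_rg_string` and the dict-based input are assumptions made for illustration, and the skip rule for empty values is a simplification of the Cheetah `#if` tests.

```python
# Illustrative only: mirrors the Cheetah #set/#if sequence from bwa_macros.xml.
# Tag names come from the diff; build_rg_string and the dict input are hypothetical.

REQUIRED_TAGS = ("ID", "SM", "PL")
OPTIONAL_TAGS = ("LB", "CN", "DS", "DT", "FO", "KS", "PG", "PI", "PU")


def build_rg_string(rg):
    """Assemble an @RG line from a mapping of tag name to value."""
    parts = ["@RG"]
    # ID, SM and PL are always appended, as in the first #set line of the macro.
    for tag in REQUIRED_TAGS:
        parts.append(tag + ":" + str(rg[tag]))
    # Optional tags are appended only when a value was supplied.
    # Comparing str(value) with "" keeps a numeric 0 (e.g. PI) instead of dropping it.
    for tag in OPTIONAL_TAGS:
        value = rg.get(tag)
        if value is None or str(value) == "":
            continue
        parts.append(tag + ":" + str(value))
    # Fields are joined with "\t", written the same way as in the macro.
    return "\t".join(parts)


if __name__ == "__main__":
    example = {"ID": "run1", "SM": "sampleA", "PL": "ILLUMINA", "LB": "lib1", "PI": 0}
    print(repr(build_rg_string(example)))
```

Building each field with explicit `str()` conversion, as the patched macro does, keeps every value visible as a separate operand, which makes a missing or empty tag easier to spot than a placeholder embedded in a string literal.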