small RNA oriented bowtie

::

    sRbowtie_wrapper.py $input
                        $method
                        $v_mismatches
                        $output_type
                        $refGenomeSource.genomeSource
                        ## the very source of the index (indexed or fasta file)
                        #if $refGenomeSource.genomeSource == "history":
                          $refGenomeSource.ownFile
                        #else:
                          $refGenomeSource.index
                        #end if
                        ##
                        $output
                        $aligned
                        $unaligned

::

    additional_fasta == "al" or additional_fasta == "al_and_unal"
    additional_fasta == "unal" or additional_fasta == "al_and_unal"

**What it does**

Bowtie_ is a short read aligner designed to be ultrafast and memory-efficient. It is developed by Ben Langmead and Cole Trapnell. Please cite: Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25.

.. _Bowtie: http://bowtie-bio.sourceforge.net/index.shtml

A generic "Map with Bowtie for Illumina" Galaxy tool is available in the main Galaxy distribution.

However, that useful Bowtie wrapper only accepts FASTQ files as input.

Our sRbowtie wrapper is intended to work specifically with short-read FASTA inputs and to serve downstream small RNA sequencing analyses.

------

**OPTIONS**

.. class:: infomark

This script uses Bowtie to match reads on a reference index.

Depending on the type of matching, different bowtie options are used:

**Match on sense strand RNA reference index, multiple mappers randomly matched at a single position**

Match on the SENSE strand of an RNA reference. Each multiple mapper is randomly assigned to one of the targets with the fewest mismatches. The polarity column is suppressed in the bowtie tabular report:

*-v [0,1,2,3] -M 1 --best --strata -p 12 --norc --suppress 2,6,7,8*

**Match unique mappers on DNA reference index**

Match ONLY unique mappers on a DNA reference index:

*-v [0,1,2,3] -m 1 -p 12 --suppress 6,7,8*

Note that using this option with -v values other than 0 is questionable.


**Match on DNA, multiple mappers randomly matched at a single position**

Match multiple mappers. Each multiple mapper is randomly assigned to one of the targets with the fewest mismatches; the number of mismatches allowed is set by the -v option:

*-v [0,1,2,3] -M 1 --best --strata -p 12 --suppress 6,7,8*

**Match on DNA as fast as possible, without taking care of mapping issues (for raw annotation of reads)**

Match at the highest speed; the best hit is not guaranteed, in exchange for the speed gain:

*-v [0,1,2,3] -k 1 --best -p 12 --suppress 6,7,8*

-----

**Input formats**

.. class:: warningmark

*The only accepted format for the script is a raw FASTA list of reads, clipped of their adapter*

-----

**OUTPUTS**

If you choose tabular as the output format, you will obtain the matched reads in standard bowtie output format, with the following columns::

    Column        Description
    ------------  --------------------------------------------------------
    1 FastaID     fasta identifier
    2 polarity    + or - depending on whether the match was reported on the forward or reverse strand
    3 target      name of the matched target
    4 Offset      0-based coordinate of the miR on the miRBase pre-miR sequence
    5 Seq         sequence of the matched read

If you choose SAM, you will get the output in unordered SAM format.

.. class:: warningmark

If you choose BAM, the output will be in sorted BAM format.
To be viewable in Trackster, several conditions must be fulfilled:

.. class:: infomark

Reads must have been matched to a FULL genome whose chromosome names are compatible with Trackster genome indexes.

.. class:: infomark

The database/Build (dbkey) indicated for the dataset (Pencil - Database/Build field) must match a Trackster genome index.

Please contact the GED galaxy team if your reference genome is not referenced properly in GED galaxy.

**Optional matched and unmatched FASTA reads can be obtained for further annotation**
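
The four matching modes listed under OPTIONS can be read as a lookup from mode to Bowtie arguments. The following is a minimal sketch of that idea, not the actual code of sRbowtie_wrapper.py: the mode keys (``RNA``, ``unique``, ``multiple``, ``k_option``) and the function name ``bowtie_args`` are hypothetical; only the option strings themselves are taken from the text above.

```python
import shlex

# Hypothetical mode keys; the option strings are the ones quoted under OPTIONS.
BOWTIE_OPTIONS = {
    "RNA": "-v {v} -M 1 --best --strata -p 12 --norc --suppress 2,6,7,8",
    "unique": "-v {v} -m 1 -p 12 --suppress 6,7,8",
    "multiple": "-v {v} -M 1 --best --strata -p 12 --suppress 6,7,8",
    "k_option": "-v {v} -k 1 --best -p 12 --suppress 6,7,8",
}


def bowtie_args(method, v_mismatches):
    """Return the Bowtie argument list for a matching mode and a -v value."""
    if v_mismatches not in (0, 1, 2, 3):
        raise ValueError("v_mismatches must be 0, 1, 2 or 3")
    try:
        template = BOWTIE_OPTIONS[method]
    except KeyError:
        raise ValueError("unknown matching mode: %r" % (method,))
    # shlex.split turns the option string into an argv-style list
    return shlex.split(template.format(v=v_mismatches))


if __name__ == "__main__":
    print(" ".join(bowtie_args("unique", 0)))
```

Keeping the options in one table makes it easy to compare modes side by side, for example to see that only the RNA mode adds ``--norc`` and suppresses column 2.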
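
For downstream annotation, the tabular output described under OUTPUTS can be read line by line. The sketch below assumes tab-separated fields in the column order given in the table, and assumes that the sense-strand RNA mode yields four columns because it suppresses the polarity column. The names ``Hit`` and ``parse_hits`` are hypothetical and are not part of the wrapper.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Hit(NamedTuple):
    read_id: str
    polarity: Optional[str]  # None when the polarity column was suppressed
    target: str
    offset: int  # 0-based coordinate on the target
    seq: str


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per non-empty line of the tabular report."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) == 5:
            read_id, polarity, target, offset, seq = fields
        elif len(fields) == 4:
            # Assumed layout when column 2 (polarity) is suppressed
            read_id, target, offset, seq = fields
            polarity = None
        else:
            raise ValueError("unexpected number of columns: %d" % len(fields))
        yield Hit(read_id, polarity, target, int(offset), seq)


if __name__ == "__main__":
    example = ["read_1\t+\tchr2L\t10\tACGT\n"]
    for hit in parse_hits(example):
        print(hit.target, hit.offset, hit.polarity)
```

A generator is used so that large reports can be streamed from an open file handle without loading every hit into memory.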