# HG changeset patch
# User jjohnson
# Date 1651004489 0
# Node ID 8ed8af5836d1c91ce35ece699e6834f4416cde4e
# Parent c58d1774c76225a1be83061473fbddc2479b125e
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/arriba commit e0aa03add09ecc4ad5a5d41c439b8af9551fc53c"
diff -r c58d1774c762 -r 8ed8af5836d1 arriba.xml
--- a/arriba.xml Fri Feb 11 19:04:06 2022 +0000
+++ b/arriba.xml Tue Apr 26 20:21:29 2022 +0000
@@ -2,6 +2,12 @@
 detect gene fusions from STAR aligned RNA-Seq data
 macros.xml
+
+
+
+
+
+
@@ -39,6 +45,31 @@
 #else
 #set $star_index_dir = $input_params.index.arriba_ref.fields.star_index
 #end if
+ #if $blacklist
+ #if $blacklist.is_of_type('tabular.gz')
+ #set $blacklist_file = 'blacklist.tsv.gz'
+ ln -sf '$blacklist' $blacklist_file &&
+ #else
+ #set $blacklist_file = $blacklist
+ #end if
+ #end if
+ #if $known_fusions
+ #if $known_fusions.is_of_type('tabular.gz')
+ #set $known_fusions_file = 'known_fusions.tsv.gz'
+ ln -sf '$known_fusions' $known_fusions_file &&
+ #else
+ #set $known_fusions_file = $known_fusions
+ #end if
+ #end if
+ #if $tags
+ #if $tags.is_of_type('tabular.gz')
+ #set $tags_file = 'tags.tsv.gz'
+ ln -sf '$tags' $tags_file &&
+ #else
+ #set $tags_file = $tags
+ #end if
+ #end if
+
+
 STAR
 --runThreadN \${GALAXY_SLOTS:-1}
 --genomeDir $star_index_dir
@@ -74,7 +105,7 @@
 -a '$genome_assembly'
 -g '$genome_annotation'
 #if $blacklist
- -b '$blacklist'
+ -b '$blacklist_file'
 #else
 -f 'blacklist'
 #end if
@@ -82,10 +113,10 @@
 -p '$protein_domains'
 #end if
 #if $known_fusions
- -k '$known_fusions'
+ -k '$known_fusions_file'
 #end if
 #if $tags
- -t '$tags'
+ -t '$tags_file'
 #end if
 #if str($wgs.use_wgs) == "yes"
 -d '$wgs.wgs'
@@ -177,9 +208,16 @@
 && samtools sort -@ \${GALAXY_SLOTS:-1} -m 4G -T tmp -O bam '$input_params.input' > Aligned.sortedByCoord.out.bam
 && samtools index Aligned.sortedByCoord.out.bam
 #end if
+#if $output_fusions_vcf
+ && convert_fusions_to_vcf.sh '$genome_assembly' fusions.tsv fusions.vcf
+#end if
+#if $output_fusion_bams
+ && mkdir fusion_bams
+ && extract_fusion-supporting_alignments.sh fusions.tsv Aligned.sortedByCoord.out.bam 'fusion_bams/fusion'
+#end if
 #if str($visualization.do_viz) == "yes"
-#set $fusions = 'fusions.tsv'
-&& @DRAW_FUSIONS@
+ #set $fusions = 'fusions.tsv'
+ && @DRAW_FUSIONS@
 #end if
 ]]>
@@ -189,7 +227,9 @@
-
+
+
+
@@ -423,6 +463,8 @@
+
+
@@ -433,13 +475,23 @@
-
-
+
+
+
+ output_fusions_discarded == True
+
+
+ output_fusions_vcf == True
+
+
+
+ output_fusion_bams == True
+
 input_params['input_source'] == "use_fastq"
@@ -471,7 +523,6 @@
-
@@ -537,6 +588,28 @@
 Arriba takes the main output file of STAR (Aligned.out.bam) as input (parameter -x). If STAR was run with the parameter --chimOutType WithinBAM, then this file contains all the information needed by Arriba to find fusions. When STAR was run with the parameter --chimOutType SeparateSAMold, the main output file lacks chimeric alignments. Instead, STAR writes them to a separate output file named Chimeric.out.sam. In this case, the file needs to be passed to Arriba via the parameter -c in addition to the main output file Aligned.out.bam.
+ STAR index create recommended parameter value:
+
+ * --sjdbOverhang 250
+
+
+ STAR recommended parameter values ::
+
+ * --outSAMunmapped Within
+ * --outFilterMultimapNmax 50
+ * --peOverlapNbasesMin 10
+ * --alignSplicedMateMapLminOverLmate 0.5
+ * --alignSJstitchMismatchNmax 5 -1 5 5
+ * --chimSegmentMin 10
+ * --chimOutType WithinBAM HardClip
+ * --chimJunctionOverhangMin 10
+ * --chimScoreDropMax 30
+ * --chimScoreJunctionNonGTAG 0
+ * --chimScoreSeparation 1
+ * --chimSegmentReadGapMax 3
+ * --chimMultimapNmax 50
+
+
 Arriba extracts three types of reads from the alignment file(s):
 * Split-reads, i.e., reads composed of segments which map in a non-linear way. STAR stores such reads as supplementary alignments.