view isoformswitchanalyzer.xml @ 5:b3f292d9f35d draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/isoformswitchanalyzer commit 7b7d9892618706dad95641831db8b9f83deb86e1
author | iuc |
---|---|
date | Fri, 02 Jun 2023 10:27:16 +0000 |
parents | 512d6462f2ce |
children |
<tool id="isoformswitchanalyzer" name="IsoformSwitchAnalyzeR" version="@TOOL_VERSION@+galaxy@SUFFIX_VERSION@"> <description>statistical identification of isoform switching</description> <macros> <import>macros.xml</import> </macros> <expand macro='xrefs'/> <expand macro='requirements'/> <stdio> <regex match="Execution halted" source="both" level="fatal" description="Execution halted." /> <regex match="Error in" source="both" level="fatal" description="An undefined error occurred, please check your input carefully and contact your administrator." /> <regex match="Fatal error" source="both" level="fatal" description="An undefined error occurred, please check your input carefully and contact your administrator." /> </stdio> <command><![CDATA[ #set $conditions = list() #set $sampleIDs = list() #set $replicates = list() #if $functionMode.selector == 'data_import' #if $functionMode.transcriptome.is_of_type("fasta.gz"): ln -s '${functionMode.transcriptome}' './transcriptome.fasta.gz' && #set $transcriptome = './transcriptome.fasta.gz' #else ln -s '${functionMode.transcriptome}' './transcriptome.fasta' && #set $transcriptome = './transcriptome.fasta' #end if #if $functionMode.genomeAnnotation.is_of_type("gtf.gz"): ln -s '${functionMode.genomeAnnotation}' './annotation.gtf.gz' && #set $annotation = './annotation.gtf.gz' #else ln -s '${functionMode.genomeAnnotation}' './annotation.gtf' && #set $annotation = './annotation.gtf' #end if mkdir -p 'input_files' && #if $functionMode.countFiles != 'disabled': mkdir -p "count_files/factor1/" && mkdir -p "count_files/factor2/" && #end if #if $functionMode.tool_source.selector == 'stringtie' #set $filename = 't_data.ctab' #if $functionMode.tool_source.novoisoforms.selector == 'novel' #if $functionMode.tool_source.novoisoforms.stringtieAnnotation.is_of_type("gtf.gz"): ln -s '${$functionMode.tool_source.novoisoforms.stringtieAnnotation}' './stringtie_annotation.gtf.gz' && #set $stringtie_annotation = './stringtie_annotation.gtf.gz' 
#else ln -s '${$functionMode.tool_source.novoisoforms.stringtieAnnotation}' './stringtie_annotation.gtf' && #set $stringtie_annotation = './stringtie_annotation.gtf' #end if #end if #else if $functionMode.tool_source.selector == 'salmon' #set $filename = 'quant.sf' #else #set $filename = 'abundance.tsv' #end if #for $index in range(len($functionMode.tool_source.first_factor.trans_counts)): $conditions.append($functionMode.tool_source.first_factor.factorLevel) $sampleIDs.append(str($functionMode.tool_source.first_factor.factorLevel) + str($index)) $replicates.append($index) mkdir './input_files/${functionMode.tool_source.first_factor.factorLevel}${index}/' && ln -s $functionMode.tool_source.first_factor.trans_counts[$index] './input_files/${functionMode.tool_source.first_factor.factorLevel}${index}/${filename}' && #end for #for $index in range(len($functionMode.tool_source.second_factor.trans_counts)): $conditions.append($functionMode.tool_source.second_factor.factorLevel) $sampleIDs.append(str($functionMode.tool_source.second_factor.factorLevel) + str($index)) $replicates.append($index) mkdir './input_files/${functionMode.tool_source.second_factor.factorLevel}${index}/' && ln -s $functionMode.tool_source.second_factor.trans_counts[$index] './input_files/${functionMode.tool_source.second_factor.factorLevel}${index}/${filename}' && #end for Rscript '${__tool_directory__}/IsoformSwitchAnalyzeR.R' #for $i, $condition in enumerate($conditions) --condition $condition --sampleID $sampleIDs[$i] --replicate $replicates[$i] #end for $functionMode.pairedSamples --modeSelector $functionMode.selector --parentDir './input_files' --annotation $annotation --transcriptome $transcriptome $functionMode.removeNonConvensionalChr --toolSource $functionMode.tool_source.selector #if $functionMode.tool_source.selector == 'stringtie' #if $functionMode.tool_source.novoisoforms.selector == 'novel' --stringtieAnnotation $stringtie_annotation #end if --readLength 
$functionMode.tool_source.averageSize $functionMode.tool_source.fixStringTieAnnotationProblem #end if --countFiles $functionMode.countFiles #else if $functionMode.selector == 'first_step' Rscript '${__tool_directory__}/IsoformSwitchAnalyzeR.R' --modeSelector $functionMode.selector --rObject $functionMode.robject --alpha $functionMode.alpha --dIFcutoff $functionMode.dIFcutoff $functionMode.onlySigIsoforms $functionMode.filterForConsequences --geneExpressionCutoff $functionMode.prefilter.geneExpressionCutoff --isoformExpressionCutoff $functionMode.prefilter.isoformExpressionCutoff --IFcutoff $functionMode.prefilter.IFcutoff $functionMode.prefilter.removeSingleIsformGenes $functionMode.prefilter.keepIsoformInAllConditions $functionMode.dexseq.correctForConfoundingFactors $functionMode.dexseq.overwriteIFvalues $functionMode.dexseq.reduceToSwitchingGenes $functionMode.dexseq.reduceFurtherToGenesWithConsequencePotential $functionMode.dexseq.keepIsoformInAllConditions --minORFlength $functionMode.novel_isoform.minORFlength --orfMethod $functionMode.novel_isoform.orfMethod --PTCDistance $functionMode.novel_isoform.PTCDistance $functionMode.extract_sequence.removeShortAAseq $functionMode.extract_sequence.removeLongAAseq $functionMode.extract_sequence.removeORFwithStop $functionMode.extract_sequence.onlySwitchingGenes #else #if $functionMode.protein_domains.selector == 'enabled' mkdir -p './pfam_files' && #for $index,$filename in enumerate($functionMode.protein_domains.analyzePFAM) ln -s $filename './pfam_files/dataset${index}.txt' && #end for #end if #if $functionMode.signal_peptides.selector == 'enabled' mkdir -p './signalp_files' && #for $index,$filename in enumerate($functionMode.signal_peptides.analyzeSignalP) ln -s $filename './signalp_files/dataset${index}.txt' && #end for #end if #if $functionMode.disordered_regions.selector == 'netsurfp' mkdir -p './netsurf_files' && #for $index,$filename in enumerate($functionMode.disordered_regions.analyzeNetSurfP2) ln -s 
$filename './netsurf_files/dataset${index}.txt' && #end for #end if Rscript '${__tool_directory__}/IsoformSwitchAnalyzeR.R' --modeSelector $functionMode.selector --rObject $functionMode.robject --analysisMode $functionMode.analysis_mode.selector --alpha $functionMode.analysis_mode.alpha --dIFcutoff $functionMode.analysis_mode.dIFcutoff #if $functionMode.analysis_mode.selector == 'top' --genesToPlot $functionMode.analysis_mode.n $functionMode.analysis_mode.advanced_options.filterForConsequences $functionMode.analysis_mode.advanced_options.sortByQvals $functionMode.analysis_mode.advanced_options.onlySigIsoforms $functionMode.analysis_mode.advanced_options.onlySwitchingGenes $functionMode.analysis_mode.advanced_options.countGenes $functionMode.analysis_mode.advanced_options.asFractionTotal $functionMode.analysis_mode.advanced_options.plotGenes $functionMode.analysis_mode.advanced_options.simplifyLocation $functionMode.analysis_mode.advanced_options.removeEmptyConsequences $functionMode.analysis_mode.advanced_options.analysisOppositeConsequence #else --gene $functionMode.analysis_mode.gene --IFcutoff $functionMode.analysis_mode.advanced_options.IFcutoff $functionMode.analysis_mode.advanced_options.rescaleTranscripts $functionMode.analysis_mode.advanced_options.reverseMinus $functionMode.analysis_mode.advanced_options.addErrorbars $functionMode.analysis_mode.advanced_options.onlySwitchingGenes #end if #if $functionMode.coding_potential.selector == 'cpat' --pathToCPATresultFile $functionMode.coding_potential.analyzeCPAT --codingCutoff $functionMode.coding_potential.codingCutoff #else if $functionMode.coding_potential.selector == 'cpc2' --pathToCPC2resultFile $functionMode.coding_potential.analyzeCPC2 $functionMode.coding_potential.removeNoncodingORFs --codingCutoff $functionMode.coding_potential.codingCutoff #end if #if $functionMode.protein_domains.selector == 'enabled' --pathToPFAMresultFile './pfam_files' #end if #if $functionMode.signal_peptides.selector == 'enabled' 
--pathToSignalPresultFile './signalp_files' --minSignalPeptideProbability $functionMode.signal_peptides.minSignalPeptideProbability #end if #if $functionMode.disordered_regions.selector == 'netsurfp' --pathToNetSurfP2resultFile './netsurf_files' --smoothingWindowSize $functionMode.disordered_regions.smoothingWindowSize --probabilityCutoff $functionMode.disordered_regions.probabilityCutoff --minIdrSize $functionMode.disordered_regions.minIdrSize #else if $functionMode.disordered_regions.selector == 'iupred2a' --pathToIUPred2AresultFile $functionMode.disordered_regions.AanalyzeIUPred2A --smoothingWindowSize $functionMode.disordered_regions.smoothingWindowSize --probabilityCutoff $functionMode.disordered_regions.probabilityCutoff --minIdrSize $functionMode.disordered_regions.minIdrSize $functionMode.disordered_regions.annotateBindingSites --minIdrBindingSize $functionMode.disordered_regions.minIdrBindingSize --minIdrBindingOverlapFrac $functionMode.disordered_regions.minIdrBindingOverlapFrac #end if --ntCutoff $functionMode.analyzeSwitchConsequences.ntCutoff #if $functionMode.analyzeSwitchConsequences.ntFracCutoff --ntFracCutoff $functionMode.analyzeSwitchConsequences.ntFracCutoff #end if --ntJCsimCutoff $functionMode.analyzeSwitchConsequences.ntJCsimCutoff --AaCutoff $functionMode.analyzeSwitchConsequences.AaCutoff --AaFracCutoff $functionMode.analyzeSwitchConsequences.AaFracCutoff --AaJCsimCutoff $functionMode.analyzeSwitchConsequences.AaJCsimCutoff $functionMode.analyzeSwitchConsequences.removeNonConseqSwitches #if $functionMode.analysis_mode.selector == 'top' && mkdir -p './pdf_outputs/' && mv *pdf './pdf_outputs/' && mv *_vs_* gene_plots #end if #end if ]]></command> <inputs> <conditional name="functionMode"> <param name="selector" type="select" label="Tool function mode" help="The first step of a IsoformSwitchAnalyzeR workflow is to import and integrate the isoform quantification with its basic annotation. 
Once you have all the relevant data imported into R (IsoformSwitchAnalyzeR will also help you with that), the workflow for identification and analysis of isoform switches with functional consequences can be divided into two parts."> <option value="data_import">Import data</option> <option value="first_step">Analysis part one: Extract isoform switches and their sequences</option> <option value="second_step">Analysis part two: Plot all isoform switches and their annotation</option> </param> <when value="data_import"> <conditional name="tool_source"> <param name="selector" type="select" label="Quantification data source" help="IsoformSwitchAnalyzeR has different functions for importing data from different sources."> <option value="stringtie">StringTie</option> <option value="salmon">Salmon</option> <option value="kallisto">Kallisto</option> </param> <when value="salmon"> <expand macro="macro_inputs"/> </when> <when value="kallisto"> <expand macro="macro_inputs"/> </when> <when value="stringtie"> <expand macro="macro_inputs"/> <param name="averageSize" type="integer" min="0" value="150" label="Average read length" help="Must be the number of base pairs sequenced. e.g. 
if the data quantified is 75 bp paired-end, the user should supply readLength=75" /> <param argument="fixStringTieAnnotationProblem" type="boolean" truevalue="--fixStringTieAnnotationProblem" falsevalue="" checked="true" label="Fix StringTie annotation problem" help="This option will automatically try to correct some of the annotation problems created when doing transcript assembly (unassigned transcripts and merged genes)" /> <conditional name="novoisoforms"> <param name="selector" type="select" label="Analysis mode"> <option value="novel">Include novel isoforms in analysis</option> <option value="reference">Reference-only analysis</option> </param> <when value="novel"> <param name="stringtieAnnotation" type="data" format="gtf,gtf.gz" label="Annotation generated by StringTie merge" help="The merged GTF is used to recalculate expression estimates using the merged, novel transcripts." /> </when> <when value="reference"/> </conditional> </when> </conditional> <param name="genomeAnnotation" type="data" format="gtf,gtf.gz" label="Genome annotation" help="It is used to integrate the coding sequence (CDS) regions in the GTF file as the ORF regions used by IsoformSwitchAnalyzeR." /> <param name="transcriptome" type="data" format="fasta,fasta.gz" label="Transcriptome" help="Please note this is different from a fasta file with the sequences of the entire genome." /> <param argument="removeNonConvensionalChr" type="boolean" truevalue="--removeNonConvensionalChr" falsevalue="" checked="false" label="Remove non-conventional chromosomes" help="These regions are typically used to annotate regions that cannot be associated with a specific region." /> <param argument="pairedSamples" type="boolean" truevalue="--pairedSamples" falsevalue="" checked="false" label="Paired samples between factors" help="Samples from different factors belong to the same individual (e.g. 
samples from the same patient from healthy and cancerous tissues, or different parts of the same plant)" /> <param name="countFiles" type="select" label="Generate count matrix files" help="If IsoformSwitchAnalyzeR is used for fixing the StringTie annotation problem, it can generate count files for analyzing differential expression with DESeq2 (when selecting collection) or CEMiTool (when selecting the expression matrix format)."> <option value="disabled">Disabled</option> <option value="collection">Collection of count files</option> <option value="matrix">Expression matrix</option> </param> </when> <!--WRAPPER FIRST STEP SECTION--> <when value="first_step"> <param name="robject" type="data" format="rdata" label="IsoformSwitchAnalyzeR R object" help="It is generated when running the data import step." /> <expand macro="macro_alpha_difcutoff"/> <expand macro="macro_onlysigisoforms1"/> <param argument="filterForConsequences" type="boolean" truevalue="--filterForConsequences" falsevalue="" checked="false" label="Filter for consequences" help="Filter for genes with functional consequences. The output will then be the number of significant genes and isoforms originating from genes with predicted consequences" /> <section name="prefilter" title="Pre-filter parameters" help="IsoformSwitchAnalyzeR will remove genes/isoforms with the aim of allowing faster processing time as well as more trustworthy results."> <param argument="geneExpressionCutoff" type="float" min="0" value="1" label="Gene expression cutoff" help="The expression cutoff (most likely in TPM/RPKM/FPKM) which the average expression in BOTH conditions must be higher than." /> <param argument="isoformExpressionCutoff" type="float" min="0" value="0" label="Isoform expression cutoff" help="The expression cutoff (most likely in RPKM/FPKM) which isoforms must be expressed more than, in at least one condition of a comparison. Default is 0 (which removes completely unused isoforms)." 
/> <expand macro="macro_ifcutoff" value="0.01" help="The cutoff on isoform usage (measured as Isoform Fraction) which isoforms must be used more than in at least one condition of a comparison" /> <param argument="removeSingleIsformGenes" type="boolean" truevalue="--removeSingleIsformGenes" falsevalue="" checked="true" label="Remove single isoform genes" help="Only keep genes containing more than one isoform (in any comparison, after the other filters have been applied)" /> <expand macro="macro_keeisoforminall" checked="false"/> </section> <section name="dexseq" title="DEXSeq parameters" help="DEXSeq is used to test isoforms (isoform resolution) for differential isoform usage."> <param argument="correctForConfoundingFactors" type="boolean" truevalue="--correctForConfoundingFactors" falsevalue="" checked="true" label="Correct for confounding factors" help="Whether IsoformSwitchAnalyzeR should use limma to correct for any confounding effects (e.g. batch effects) as indicated in the design matrix (as additional columns apart from the two default columns)" /> <param argument="overwriteIFvalues" type="boolean" truevalue="--overwriteIFvalues" falsevalue="" checked="true" label="Overwrite IF values" help="It indicates whether to overwrite the IF and dIF stored in the switchAnalyzeRlist with the corrected IF and dIF values - if no confounding effects are present in the design matrix this will not change anything" /> <param argument="reduceToSwitchingGenes" type="boolean" truevalue="--reduceToSwitchingGenes" falsevalue="" checked="true" label="Reduce to switching genes" help="Reduce to the genes which contain at least one isoform significantly differentially used (as indicated by the alpha and dIFcutoff parameters)" /> <param argument="reduceFurtherToGenesWithConsequencePotential" type="boolean" truevalue="--reduceFurtherToGenesWithConsequencePotential" falsevalue="" checked="false" label="Reduce to genes with consequence potential" help="This argument is a 
stricter version of reduceToSwitchingGenes as it not only requires that at least one isoform is significantly differentially used (as indicated by the alpha and dIFcutoff parameters) but also that there is an isoform with the opposite effect size (e.g. used less if the first isoform is used more). The minimum effect size of the opposing isoform usage is also controlled by dIFcutoff. The existence of such an opposing isoform means a switch pair can be formed" /> <expand macro="macro_keeisoforminall" checked="true"/> </section> <section name="novel_isoform" title="Novel isoform analysis parameters" help="For the subset of isoforms not already annotated with ORFs this function predicts the most likely Open Reading Frame (ORF) and the NMD sensitivity. This function is made to help annotate isoforms if you have performed (guided) de-novo isoform reconstruction (isoform deconvolution)."> <param argument="minORFlength" type="integer" min="0" value="100" label="Minimum ORF length" help="The minimum size (in nucleotides) an ORF must be to be considered (and reported). Default is 100 nucleotides, which around 97.5% of Gencode coding isoforms in both human and mouse have." /> <param argument="orfMethod" type="select" label="ORF identification method" help="More information in the help section"> <option value="longest.AnnotatedWhenPossible">Longest and annotated when possible</option> <option value="longest">Longest</option> <option value="mostUpstream">Most upstream</option> <option value="longestAnnotated">Longest annotated</option> <option value="mostUpstreamAnnoated">Most upstream annotated</option> </param> <param argument="PTCDistance" type="integer" min="0" value="50" label="Maximal allowed premature termination codon-distance" help="The minimum distance (number of nucleotides) from the STOP codon to the final exon-exon junction. If the distance from the STOP to the final exon-exon junction is larger than this, the isoform will be marked as NMD-sensitive. 
" /> </section> <section name="extract_sequence" title="Sequence extraction parameters" help="IsoformSwitchAnalyzeR extracts the nucleotide (NT) sequence of transcripts by extracting and concatenating the sequences of a reference genome corresponding to the genomic coordinates of the isoforms."> <expand macro="macro_onlyswitching"/> <param argument="removeShortAAseq" type="boolean" truevalue="--removeShortAAseq" falsevalue="" checked="true" label="Remove short amino acid sequences" help="This option exists to allow easier usage of the Pfam and SignalP web servers, which both currently have restrictions on allowed sequence lengths. If enabled, AA sequences are filtered to be > 5 AA. This will only affect the sequences written to the FASTA file, not the sequences added to the switchAnalyzeRlist" /> <param argument="removeLongAAseq" type="boolean" truevalue="--removeLongAAseq" falsevalue="" checked="false" label="Remove long amino acid sequences" help="Whether to remove sequences based on their length. This option exists to allow easier usage of the Pfam and SignalP web servers, which both currently have restrictions on allowed sequence lengths. If enabled, AA sequences are filtered to be shorter than 1000 AA. This will only affect the sequences written to the FASTA file (if writeToFile=TRUE), not the sequences added to the switchAnalyzeRlist." /> <param argument="removeORFwithStop" type="boolean" truevalue="--removeORFwithStop" falsevalue="" checked="true" label="Remove ORFs containing STOP codons" help="If enabled, ORFs containing stop codons, defined as * when the ORF nucleotide sequence is translated to the amino acid sequence, are A) removed from the ORF annotation in the switchAnalyzeRlist and B) removed from the sequences added to the switchAnalyzeRlist and/or written to FASTA files. 
This is only necessary if you are analyzing quantified known annotated data where you supplied a GTF file to the import function" /> </section> <param name="outputs_first" type="select" display="checkboxes" multiple="true" label="Outputs selector"> <option value="nt" selected="true">Nucleotide sequences</option> <option value="aa" selected="true">Amino acid sequences</option> <option value="summary" selected="true">Gene switch summary</option> </param> </when> <!-- WRAPPER SECOND STEP SECTION--> <when value="second_step"> <param name="robject" type="data" format="rdata" label="IsoformSwitchAnalyzeR R object" help="It is generated when running the analysis part one." /> <conditional name="analysis_mode"> <param name="selector" type="select" label="Analysis mode" help="This selector allows you to specify whether to analyze a specific gene or the (top) switching genes/isoforms"> <option value="top" selected="true">Full analysis</option> <option value="single">Analyze specific gene</option> </param> <when value="top"> <expand macro="macro_alpha_difcutoff"/> <param argument="n" type="integer" min="1" value="10" label="Number of top switching features (genes/isoforms) to plot" help="This parameter specifies the number of top genes/isoforms to plot"/> <section name="advanced_options" title="Full analysis advanced options"> <param argument="filterForConsequences" type="boolean" truevalue="--filterForConsequences" falsevalue="" checked="false" label="Filter genes with functional consequences" help="The output will then be the number of significant genes and isoforms originating from genes with predicted consequences"/> <param argument="sortByQvals" type="boolean" truevalue="--sortByQvals" falsevalue="" checked="true" label="Sorting mode" help="Whether the top n features are sorted by decreasing significance (increasing q-values) (if enabled) or decreasing switch size (absolute dIF, which are still significant as defined by alpha) (if disabled). 
The dIF values for genes are considered as the total change within the gene calculated as sum(abs(dIF)) for each gene" /> <expand macro="macro_onlysigisoforms2"/> <expand macro="macro_onlyswitching"/> <param argument="countGenes" type="boolean" truevalue="--countGenes" falsevalue="" checked="true" label="Count genes or isoform switches" help="This parameter indicates whether it is the number of genes (if enabled) or isoform switches (if disabled) which primarily result in gain/loss that are counted" /> <param argument="asFractionTotal" type="boolean" truevalue="--asFractionTotal" falsevalue="" checked="false" label="Summarize as numbers or as fraction" help="Whether the consequences/splicing events should be summarized as numbers (if disabled) or as a fraction of the total number of switches/genes (if enabled)" /> <param argument="plotGenes" type="boolean" truevalue="--plotGenes" falsevalue="" checked="false" label="Plot number/fraction of genes or isoforms" help="Plot the number/fraction of genes (if enabled) or isoforms (if disabled) involved in isoform switches with functional consequences (both filtered via alpha and dIFcutoff)" /> <param argument="simplifyLocation" type="boolean" truevalue="--simplifyLocation" falsevalue="" checked="true" label="Simplify location" help="Simplify the switches involved in changes in subcellular localizations (due to the hundreds of possible combinations)" /> <param argument="removeEmptyConsequences" type="boolean" truevalue="--removeEmptyConsequences" falsevalue="" checked="false" label="Remove empty consequences" help="Remove consequences that were analyzed but where no differences were found (those showing zero in the plot)" /> <param argument="analysisOppositeConsequence" type="boolean" truevalue="--analysisOppositeConsequence" falsevalue="" checked="false" label="Analyze opposite consequences in enrichment analysis" help="Reverse the analysis, meaning that if 'Domain gains' are analyzed, the analysis will instead be performed on 'Domain 
loss'. The main effect is on the visual appearance of the plot, which will be mirrored (around the 0.5 fraction)" /> </section> </when> <when value="single"> <param argument="gene" type="text" value="" label="Gene name" help="Either the gene_id or the gene name of the gene to plot"> <sanitizer invalid_char=""> <valid initial="string.letters,string.digits"> <add value="_" /> <add value="-" /> </valid> </sanitizer> <validator type="regex">[0-9a-zA-Z_-]+</validator> </param> <expand macro="macro_alpha_difcutoff"/> <section name="advanced_options" title="Single gene mode advanced options"> <expand macro="macro_ifcutoff" value="0.05" help="The cutoff on the minimum contribution to gene expression (in at least one condition) an isoform must have to be plotted (measured as Isoform Fraction (IF) values)" /> <param argument="rescaleTranscripts" type="boolean" truevalue="--rescaleTranscripts" falsevalue="" checked="true" label="Rescale transcripts" help="All the isoforms should be rescaled to the square root of their original sizes. This feature is implemented because introns usually are much larger than exons making it difficult to see structural changes. 
This is very useful for structural visualization but the scaling might distort actual intron and exon sizes" /> <param argument="reverseMinus" type="boolean" truevalue="--reverseMinus" falsevalue="" checked="true" label="Isoforms on minus strand should be inverted" help="Isoforms on minus strand should be inverted so they are visualized as going from left to right instead of right to left" /> <param argument="addErrorbars" type="boolean" truevalue="--addErrorbars" falsevalue="" checked="true" label="Add error bars" help="Error bars should be added to the expression plots to show uncertainty in estimates" /> <expand macro="macro_onlyswitching"/> </section> </when> </conditional> <conditional name="coding_potential"> <param name="selector" type="select" label="Include prediction of coding potential information" help="Integrate the output from CPAT or CPC2 in the analysis."> <option value="disabled">Disabled</option> <option value="cpat">CPAT</option> <option value="cpc2">CPC2</option> </param> <when value="disabled"/> <when value="cpat"> <param argument="analyzeCPAT" type="data" format="txt" label="CPAT result file" help="Use default parameters and the nucleotide fasta file (_nt.fasta). If the webserver was used, download the tab-delimited result file (available at the bottom of the result page). If a stand-alone version was used, just supply the path to the result file" /> <param argument="codingCutoff" type="float" min="0" max="1" value="0.725" label="Coding cutoff" help="Cutoff used by CPAT for distinguishing between coding and non-coding transcripts. The cutoff is dependent on the species analyzed. IsoformSwitchAnalyzeR developers suggest that the optimal cutoff for overlapping coding and noncoding isoforms is 0.725 for human and 0.721 for mouse. 
However, the suggested cutoffs from the CPAT developers, derived by comparing known genes to random non-coding regions of the genome, are 0.364 for human and 0.44 for mouse" /> </when> <when value="cpc2"> <param argument="analyzeCPC2" type="data" format="txt" label="CPC2 result file" help="Use default parameters and if required select the most similar species. If the webserver (batch submission) was used, download the tab-delimited result file (via the “Download the result” button). If a stand-alone version was used, just supply the path to the result file" /> <param argument="removeNoncodingORFs" type="boolean" truevalue="--removeNoncodingORFs" falsevalue="" checked="false" label="Remove non-coding ORFs" help="Remove ORF information from the isoforms which the CPC2 analysis classifies as non-coding. This can be particularly useful if the isoform (and ORF) was predicted de-novo but is not recommended if ORFs were imported from a GTF file" /> <param argument="codingCutoff" type="float" min="0" max="1" value="0.5" label="Coding cutoff" help="Numeric indicating the cutoff used by CPC2 for distinguishing between coding and non-coding transcripts. The cutoff appears to be species independent." /> </when> </conditional> <conditional name="protein_domains"> <param name="selector" type="select" label="Include Pfam information" help="Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models."> <option value="disabled">Disabled</option> <option value="enabled">Enabled</option> </param> <when value="disabled"/> <when value="enabled"> <param argument="analyzePFAM" type="data" format="txt" multiple="true" optional="true" label="Include Pfam results (sequence analysis of protein domains)" help="Use default parameters and the amino acid fasta file (_AA.fasta). 
If the webserver is used, you need to copy/paste the result part of the mail you receive into an empty plain text document (notepad, sublimetext, TextEdit or similar (not Word)) and save that to a plain text (txt) file. The path to that file should be supplied. If a stand-alone version was used, just supply the path to the result file" /> </when> </conditional> <conditional name="signal_peptides"> <param name="selector" type="select" label="Include SignalP results" help="Integration of the result of SignalP (external sequence analysis of signal peptides)"> <option value="disabled">Disabled</option> <option value="enabled">Enabled</option> </param> <when value="disabled"/> <when value="enabled"> <param argument="analyzeSignalP" type="data" format="txt" multiple="true" optional="true" label="SignalP" help="Use the amino acid fasta file (_AA.fasta). If using the webserver, SignalP should be run with the parameter 'Short output (no figures)' under 'Output format' and one should select the appropriate 'Organism group'. When using a stand-alone version, SignalP should be run with the '-f summary' option. If using the webserver, the results can be downloaded using the 'Downloads' button in the top-right corner where the user should select 'Prediction summary' and supply the path to the resulting file to the 'pathToSignalPresultFile' argument. If a stand-alone version was used, just supply the path to the summary result file." 
/> <param argument="minSignalPeptideProbability" type="float" min="0" max="1" value="0.5" label="Minimum probability for calling a signal peptide"/> </when> </conditional> <conditional name="disordered_regions"> <param name="selector" type="select" label="Include prediction of intrinsically disordered regions (IDR) information" help="Integrate the output from IUPred2A or NetSurfP-2 in the analysis"> <option value="disabled">Disabled</option> <option value="iupred2a">IUPred2A</option> <option value="netsurfp">NetSurfP-2</option> </param> <when value="disabled"/> <when value="iupred2a"> <param argument="AanalyzeIUPred2A" type="data" format="txt,gz" label="IUPred2A result file" help="Can be gzipped. If multiple result files were created (multiple web-server runs) just supply all of them." /> <expand macro="macro_disordered_regions"/> <param argument="annotateBindingSites" type="boolean" truevalue="--annotateBindingSites" falsevalue="" checked="true" label="Annotate binding sites" help="Integrate the ANCHOR2 prediction of Intrinsically Disordered Binding Regions (IDBRs)" /> <param argument="minIdrBindingSize" type="integer" min="0" value="15" label="Minimum IDBR binding size" help="How long a stretch of binding site an Intrinsically Disordered Binding Region (IDBR) must be to be considered" /> <param argument="minIdrBindingOverlapFrac" type="float" min="0" value="0.8" label="Minimum fraction of a predicted IDBR" help="Minimum fraction of a predicted IDBR that must also be within an IDR before the IDR is considered an IDR with a binding region" /> </when> <when value="netsurfp"> <param argument="analyzeNetSurfP2" type="data" format="txt,gz" multiple="true" label="NetSurfP-2 result file" help="Can be gzipped. If multiple result files were created (multiple web-server runs) just supply all of them." 
/> <expand macro="macro_disordered_regions"/> </when> </conditional> <section name="analyzeSwitchConsequences" title="Analyze switch consequences parameters"> <param argument="ntCutoff" type="integer" min="0" value="50" label="Nucleotide length cutoff" help="The length difference (in nucleotides) that a comparison must exceed for differences to be reported" /> <param argument="ntFracCutoff" type="float" min="0" max="1" optional="true" label="Nucleotide length fraction cutoff" help="The cutoff in length difference, measured as a fraction of the length of the downregulated isoform, that a comparison must exceed for differences to be reported. For example, 0.05 means the upregulated isoform must be 5% longer/shorter before it is reported." /> <param argument="ntJCsimCutoff" type="float" min="0" max="1" value="0.8" label="Cutoff on Jaccard similarity between the overlap of two nucleotide sequences" help="If the measured JCsim is smaller than this cutoff, the sequences are considered different and reported as such" /> <param argument="AaCutoff" type="integer" min="0" value="10" label="Aminoacid length cutoff" help="Length difference (in AA) that a comparison must exceed for differences to be reported when evaluating 'ORF_seq_similarity'; primarily implemented to avoid differences in very short AA sequences being classified as different" /> <param argument="AaFracCutoff" type="float" min="0" max="1" value="0.5" label="Aminoacid length fraction cutoff" help="Cutoff of length difference of the protein domain or IDR. 
The difference, measured as a fraction of the longest region, must exceed this cutoff before it is reported" /> <param argument="AaJCsimCutoff" type="float" min="0" max="1" value="0.9" label="Cutoff on Jaccard similarity between the overlap of two aminoacid sequences" help="If the measured JCsim is smaller than this cutoff, the sequences are considered different and reported as such" /> <param argument="removeNonConseqSwitches" type="boolean" truevalue="--removeNonConseqSwitches" falsevalue="" checked="true" label="Remove the comparison of isoforms where no consequences were found" /> </section> </when> </conditional> </inputs> <outputs> <collection name="collection_counts_factor1" type="list" label="${tool.name} on ${on_string}: gene counts factor1"> <discover_datasets pattern="__designation_and_ext__" format="tabular" directory="count_files/factor1" /> <filter>functionMode['selector'] == 'data_import'</filter> <filter>functionMode['countFiles'] == 'collection'</filter> </collection> <collection name="collection_counts_factor2" type="list" label="${tool.name} on ${on_string}: gene counts factor2"> <discover_datasets pattern="__designation_and_ext__" format="tabular" directory="count_files/factor2" /> <filter>functionMode['selector'] == 'data_import'</filter> <filter>functionMode['countFiles'] == 'collection'</filter> </collection> <data name="matrix_counts" format="tabular" from_work_dir="count_files/matrix.tabular" label="${tool.name} on ${on_string}: gene counts matrix"> <filter>functionMode['selector'] == 'data_import'</filter> <filter>functionMode['countFiles'] == 'matrix'</filter> </data> <data name="sample_annotation" format="tabular" from_work_dir="count_files/samples.tabular" label="${tool.name} on ${on_string}: samples annotation"> <filter>functionMode['selector'] == 'data_import'</filter> <filter>functionMode['countFiles'] == 'matrix'</filter> </data> <data name="switchList" format="rdata" from_work_dir="SwitchList.Rda" label="${tool.name} on ${on_string}: SwitchList 
(RData)"/> <data name="isoformAA" format="fasta" from_work_dir="isoformSwitchAnalyzeR_isoform_AA.fasta" label="${tool.name} on ${on_string}: aminoacid sequences"> <filter>functionMode['selector'] == 'first_step'</filter> <filter>functionMode['outputs_first'] and 'aa' in functionMode['outputs_first']</filter> </data> <data name="isoformNT" format="fasta" from_work_dir="isoformSwitchAnalyzeR_isoform_nt.fasta" label="${tool.name} on ${on_string}: nucleotide sequences"> <filter>functionMode['selector'] == 'first_step'</filter> <filter>functionMode['outputs_first'] and 'nt' in functionMode['outputs_first']</filter> </data> <data name="switchSummary" format="tabular" from_work_dir="switchSummary.tsv" label="${tool.name} on ${on_string}: summary"> <filter>functionMode['selector'] == 'first_step'</filter> <filter>functionMode['outputs_first'] and 'summary' in functionMode['outputs_first']</filter> </data> <collection name="plots_summary" type="list" label="${tool.name} on ${on_string}: genome wide plots"> <discover_datasets pattern="__designation_and_ext__" format="pdf" directory="pdf_outputs" /> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </collection> <collection name="genes_consequences" type="list" label="${tool.name} on ${on_string}: isoform switches with predicted functional consequences plots"> <discover_datasets pattern="__designation_and_ext__" format="pdf" directory="gene_plots/with_consequences" /> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </collection> <collection name="genes_wo_consequences" type="list" label="${tool.name} on ${on_string}: isoform switches without predicted functional consequences plots"> <discover_datasets pattern="__designation_and_ext__" format="pdf" directory="gene_plots/without_consequences" /> <filter>functionMode['selector'] == 'second_step'</filter> 
<filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </collection> <data name="mostSwitching" format="tabular" from_work_dir="mostSwitchingGene.tsv" label="${tool.name} on ${on_string}: switching gene/isoforms"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="consequencesSummary" format="tabular" from_work_dir="consequencesSummary.tsv" label="${tool.name} on ${on_string}: consequences summary"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="consequencesEnrichment" format="tabular" from_work_dir="consequencesEnrichment.tsv" label="${tool.name} on ${on_string}: consequences enrichment"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="splicingSummary" format="tabular" from_work_dir="splicingSummary.tsv" label="${tool.name} on ${on_string}: splicing summary"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="splicingEnrichment" format="tabular" from_work_dir="splicingEnrichment.tsv" label="${tool.name} on ${on_string}: splicing enrichment"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="splicing_fulldata" format="tabular" from_work_dir="switchSplicing_fulldata.tsv" label="${tool.name} on ${on_string}: alternative splicing full data"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="consequences_fulldata" format="tabular" from_work_dir="switchConsequence_fulldata.tsv" label="${tool.name} on ${on_string}: functional consequences full data"> <filter>functionMode['selector'] == 
'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="isoformFeatures" format="tabular" from_work_dir="IsoformFeatures.tsv" label="${tool.name} on ${on_string}: isoform features"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'top'</filter> </data> <data name="single_gene" format="pdf" from_work_dir="single_gene.pdf" label="${tool.name} on ${on_string}: single gene analysis"> <filter>functionMode['selector'] == 'second_step'</filter> <filter>functionMode['analysis_mode']['selector'] == 'single'</filter> </data> </outputs> <tests> <!-- Test 01: Data import mode--> <test expect_num_outputs="1"> <conditional name="functionMode"> <param name="selector" value="data_import"/> <param name="genomeAnnotation" value="annotation_salmon.gtf.gz"/> <param name="transcriptome" value="transcriptome.fasta.gz"/> <param name="countFiles" value="disabled"/> <conditional name="tool_source"> <param name="selector" value="salmon"/> <section name="first_factor"> <param name="factorLevel" value="health"/> <param name="trans_counts" value="salmon_cond1_rep1.sf,salmon_cond1_rep2.sf"/> </section> <section name="second_factor"> <param name="factorLevel" value="cancer"/> <param name="trans_counts" value="salmon_cond2_rep1.sf,salmon_cond2_rep2.sf"/> </section> </conditional> </conditional> <output name="switchList" file="test01.RData" ftype="rdata" compare="sim_size" delta="100"/> </test> <!-- Test 02: Data import mode generate expression matrix--> <test expect_num_outputs="3"> <conditional name="functionMode"> <param name="selector" value="data_import"/> <param name="genomeAnnotation" value="annotation_salmon.gtf.gz"/> <param name="transcriptome" value="transcriptome.fasta.gz"/> <param name="countFiles" value="matrix"/> <conditional name="tool_source"> <param name="selector" value="salmon"/> <section name="first_factor"> <param name="factorLevel" value="health"/> <param 
name="trans_counts" value="salmon_cond1_rep1.sf,salmon_cond1_rep2.sf"/> </section> <section name="second_factor"> <param name="factorLevel" value="cancer"/> <param name="trans_counts" value="salmon_cond2_rep1.sf,salmon_cond2_rep2.sf"/> </section> </conditional> </conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="652170" delta="300"/> </assert_contents> </output> <output name="matrix_counts" file="test02_counts.tabular" ftype="tabular" lines_diff="6"/> <output name="sample_annotation" file="test02_samples_annotation.tabular" ftype="tabular"/> </test> <!-- Test 03: Data import mode generate collection count files--> <test expect_num_outputs="3"> <conditional name="functionMode"> <param name="selector" value="data_import"/> <param name="genomeAnnotation" value="annotation_salmon.gtf.gz"/> <param name="transcriptome" value="transcriptome.fasta.gz"/> <param name="countFiles" value="collection"/> <conditional name="tool_source"> <param name="selector" value="salmon"/> <section name="first_factor"> <param name="factorLevel" value="health"/> <param name="trans_counts" value="salmon_cond1_rep1.sf,salmon_cond1_rep2.sf"/> </section> <section name="second_factor"> <param name="factorLevel" value="cancer"/> <param name="trans_counts" value="salmon_cond2_rep1.sf,salmon_cond2_rep2.sf"/> </section> </conditional> </conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="652170" delta="300"/> </assert_contents> </output> <output_collection name="collection_counts_factor1" type="list" count="2"> <element name="health0_dataset" file="test03_health_counts.tabular" ftype="tabular" lines_diff="6"/> </output_collection> <output_collection name="collection_counts_factor2" type="list" count="2"> <element name="cancer0_dataset" file="test03_cancer_counts.tabular" ftype="tabular" lines_diff="6"/> </output_collection> </test> <!-- Test 04: Extract isoform switches all outputs--> <test expect_num_outputs="4"> <conditional 
name="functionMode"> <param name="selector" value="first_step"/> <param name="robject" value="test01.RData"/> <param name="alpha" value="0.05"/> <param name="dIFcutoff" value="0.1"/> <param name="onlySigIsoforms" value="false"/> <param name="filterForConsequences" value="false"/> <param name="outputs_first" value="nt,aa,summary"/> <section name="prefilter"> <param name="geneExpressionCutoff" value="1"/> <param name="isoformExpressionCutoff" value="0"/> <param name="IFcutoff" value="0.01"/> <param name="removeSingleIsformGenes" value="true"/> <param name="keepIsoformInAllConditions" value="false"/> </section> <section name="dexseq"> <param name="correctForConfoundingFactors" value="true"/> <param name="overwriteIFvalues" value="true"/> <param name="reduceToSwitchingGenes" value="true"/> <param name="reduceFurtherToGenesWithConsequencePotential" value="false"/> <param name="keepIsoformInAllConditions" value="true"/> </section> <section name="novel_isoform"> <param name="minORFlength" value="100"/> <param name="orfMethod" value="longest.AnnotatedWhenPossible"/> <param name="PTCDistance" value="50"/> </section> <section name="extract_sequence"> <param name="onlySwitchingGenes" value="true"/> <param name="removeShortAAseq" value="true"/> <param name="removeLongAAseq" value="false"/> <param name="removeORFwithStop" value="true"/> </section> </conditional> <output name="switchList" file="test04.RData" ftype="rdata" compare="sim_size" delta="100"/> <output name="isoformAA" ftype="fasta"> <assert_contents> <has_size value="138275" delta="300"/> <has_text text=">TCONS_00000007"/> <has_text text="MLLPPGSLSRPRTFSSQPLQT"/> </assert_contents> </output> <output name="isoformNT" ftype="fasta"> <assert_contents> <has_size value="780375" delta="300"/> <has_text text=">TCONS_00000007"/> <has_text text="GGGTCTCCCTCTGTTGTCCAAGGC"/> </assert_contents> </output> <output name="switchSummary" file="test04_summary.tabular" ftype="tabular"/> </test> <!-- Test 05: Extract isoform switches 
alternative parameters--> <test expect_num_outputs="1"> <conditional name="functionMode"> <param name="selector" value="first_step"/> <param name="robject" value="test01.RData"/> <param name="outputs_first" value=""/> <section name="dexseq"> <param name="correctForConfoundingFactors" value="true"/> <param name="overwriteIFvalues" value="true"/> <param name="reduceToSwitchingGenes" value="true"/> <param name="reduceFurtherToGenesWithConsequencePotential" value="true"/> <param name="keepIsoformInAllConditions" value="true"/> </section> <section name="novel_isoform"> <param name="orfMethod" value="mostUpstream"/> </section> </conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="500518" delta="300"/> </assert_contents> </output> </test> <!-- Test 06: generate plots and summaries, full analysis--> <test expect_num_outputs="12"> <conditional name="functionMode"> <param name="selector" value="second_step"/> <param name="robject" value="test04.RData"/> <section name="analyzeSwitchConsequences"> <param name="ntCutoff" value="50"/> <param name="ntJCsimCutoff" value="0.8"/> <param name="AaCutoff" value="10"/> <param name="AaFracCutoff" value="0.5"/> <param name="AaJCsimCutoff" value="0.9"/> <param name="removeNonConseqSwitches" value="true"/> </section> <conditional name="analysis_mode"> <param name="selector" value="top"/> <param name="alpha" value="0.05"/> <param name="dIFcutoff" value="0.1"/> <param name="n" value="2"/> <section name="advanced_options"> <param name="filterForConsequences" value="false"/> <param name="sortByQvals" value="true"/> <param name="onlySigIsoforms" value="false"/> <param name="onlySwitchingGenes" value="true"/> <param name="countGenes" value="true"/> <param name="asFractionTotal" value="false"/> <param name="plotGenes" value="false"/> <param name="simplifyLocation" value="true"/> <param name="removeEmptyConsequences" value="false"/> <param name="analysisOppositeConsequence" value="false"/> </section> </conditional> 
</conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="531580" delta="300"/> </assert_contents> </output> <output_collection name="plots_summary" type="list" count="7"> <element name="consequencesEnrichment" ftype="pdf"> <assert_contents> <has_size value="5995" delta="300"/> </assert_contents> </element> <element name="extractConsequencesSummary" ftype="pdf"> <assert_contents> <has_size value="5681" delta="300"/> </assert_contents> </element> <element name="splicingEnrichment" ftype="pdf"> <assert_contents> <has_size value="6361" delta="300"/> </assert_contents> </element> <element name="splicingGenomewide" ftype="pdf"> <assert_contents> <has_size value="165752" delta="300"/> </assert_contents> </element> <element name="splicingSummary" ftype="pdf"> <assert_contents> <has_size value="5990" delta="300"/> </assert_contents> </element> <element name="switchGene" ftype="pdf"> <assert_contents> <has_size value="18432" delta="300"/> </assert_contents> </element> <element name="volcanoPlot" ftype="pdf"> <assert_contents> <has_size value="20147" delta="300"/> </assert_contents> </element> </output_collection> <output_collection name="genes_consequences" type="list" count="2"> <element name="1_switch_plot_NADK_aka_NADK" ftype="pdf"> <assert_contents> <has_size value="8716" delta="300"/> </assert_contents> </element> <element name="2_switch_plot_PRKCZ_aka_PRKCZ" ftype="pdf"> <assert_contents> <has_size value="8463" delta="300"/> </assert_contents> </element> </output_collection> <output_collection name="genes_wo_consequences" type="list" count="2"> <element name="1_switch_plot_CLSTN1_aka_CLSTN1" ftype="pdf"> <assert_contents> <has_size value="8039" delta="300"/> </assert_contents> </element> <element name="2_switch_plot_ZBTB40_aka_ZBTB40" ftype="pdf"> <assert_contents> <has_size value="7506" delta="300"/> </assert_contents> </element> </output_collection> <output name="mostSwitching" file="test06_switching.tabular" ftype="tabular" 
lines_diff="4"/> <output name="consequencesSummary" file="test06_consequences_summary.tabular" ftype="tabular" lines_diff="4"/> <output name="consequencesEnrichment" file="test06_consequences_enrichment.tabular" ftype="tabular" lines_diff="4"/> <output name="splicingSummary" file="test06_splicing_summary.tabular" ftype="tabular" lines_diff="4"/> <output name="splicingEnrichment" file="test06_splicing_enrichment.tabular" ftype="tabular" lines_diff="4"/> <output name="consequences_fulldata" ftype="tabular"> <assert_contents> <has_size value="51508" delta="100"/> <has_text text="ORF_genomic"/> </assert_contents> </output> <output name="splicing_fulldata" ftype="tabular"> <assert_contents> <has_size value="46951" delta="100"/> <has_text text="ATTS_genomic_start"/> </assert_contents> </output> <output name="isoformFeatures" ftype="tabular"> <assert_contents> <has_size value="94888" delta="100"/> <has_text text="gene_overall_mean"/> </assert_contents> </output> </test> <!-- Test 07: generate plots and summaries, full analysis with all inputs--> <test expect_num_outputs="12"> <conditional name="functionMode"> <param name="selector" value="second_step"/> <param name="robject" value="test04.RData"/> <section name="analyzeSwitchConsequences"> <param name="ntCutoff" value="20"/> <param name="ntJCsimCutoff" value="0.5"/> <param name="AaCutoff" value="10"/> <param name="AaFracCutoff" value="0.4"/> <param name="AaJCsimCutoff" value="0.8"/> <param name="removeNonConseqSwitches" value="false"/> </section> <conditional name="analysis_mode"> <param name="selector" value="top"/> <param name="alpha" value="0.05"/> <param name="dIFcutoff" value="0.1"/> <param name="n" value="2"/> <section name="advanced_options"> <param name="filterForConsequences" value="false"/> <param name="sortByQvals" value="true"/> <param name="onlySigIsoforms" value="false"/> <param name="onlySwitchingGenes" value="true"/> <param name="countGenes" value="true"/> <param name="asFractionTotal" value="false"/> <param 
name="plotGenes" value="false"/> <param name="simplifyLocation" value="true"/> <param name="removeEmptyConsequences" value="false"/> <param name="analysisOppositeConsequence" value="false"/> </section> </conditional> <conditional name="coding_potential"> <param name="selector" value="cpc2"/> <param name="analyzeCPC2" value="cpc2_result.txt"/> <param name="removeNoncodingORFs" value="false"/> <param name="codingCutoff" value="0.5"/> </conditional> <conditional name="protein_domains"> <param name="selector" value="enabled"/> <param name="analyzePFAM" value="pfam_results.txt"/> </conditional> <conditional name="signal_peptides"> <param name="selector" value="enabled"/> <param name="analyzeSignalP" value="signalP_results.txt"/> <param name="minSignalPeptideProbability" value="0.5"/> </conditional> <conditional name="disordered_regions"> <param name="selector" value="iupred2a"/> <param name="AanalyzeIUPred2A" value="iupred2a_result.txt.gz"/> <param name="smoothingWindowSize" value="5"/> <param name="probabilityCutoff" value="0.5"/> <param name="minIdrSize" value="30"/> <param name="annotateBindingSites" value="true"/> <param name="minIdrBindingSize" value="15"/> <param name="minIdrBindingOverlapFrac" value="0.8"/> </conditional> </conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="542120" delta="300"/> </assert_contents> </output> <output_collection name="plots_summary" type="list" count="7"> <element name="consequencesEnrichment" ftype="pdf"> <assert_contents> <has_size value="5995" delta="300"/> </assert_contents> </element> <element name="extractConsequencesSummary" ftype="pdf"> <assert_contents> <has_size value="6617" delta="300"/> </assert_contents> </element> <element name="splicingEnrichment" ftype="pdf"> <assert_contents> <has_size value="6361" delta="300"/> </assert_contents> </element> <element name="splicingGenomewide" ftype="pdf"> <assert_contents> <has_size value="165752" delta="300"/> </assert_contents> </element> 
<element name="splicingSummary" ftype="pdf"> <assert_contents> <has_size value="5990" delta="300"/> </assert_contents> </element> <element name="switchGene" ftype="pdf"> <assert_contents> <has_size value="18432" delta="300"/> </assert_contents> </element> <element name="volcanoPlot" ftype="pdf"> <assert_contents> <has_size value="20147" delta="300"/> </assert_contents> </element> </output_collection> <output_collection name="genes_consequences" type="list" count="2"> <element name="1_switch_plot_NADK_aka_NADK" ftype="pdf"> <assert_contents> <has_size value="8716" delta="300"/> </assert_contents> </element> <element name="2_switch_plot_PRKCZ_aka_PRKCZ" ftype="pdf"> <assert_contents> <has_size value="8463" delta="300"/> </assert_contents> </element> </output_collection> <output_collection name="genes_wo_consequences" type="list" count="2"> <element name="1_switch_plot_CLSTN1_aka_CLSTN1" ftype="pdf"> <assert_contents> <has_size value="8559" delta="300"/> </assert_contents> </element> <element name="2_switch_plot_ZBTB40_aka_ZBTB40" ftype="pdf"> <assert_contents> <has_size value="8051" delta="300"/> </assert_contents> </element> </output_collection> <output name="mostSwitching" ftype="tabular"> <assert_contents> <has_size value="4062" delta="50"/> <has_text text="RPL11"/> </assert_contents> </output> <output name="consequencesSummary" ftype="tabular"> <assert_contents> <has_size value="1192" delta="50"/> <has_text text="nrGenesWithConsequences"/> </assert_contents> </output> <output name="consequencesEnrichment" ftype="tabular"> <assert_contents> <has_size value="1432" delta="50"/> <has_text text="NMD insensitive (paired with NMD sensitive"/> </assert_contents> </output> <output name="splicingSummary" ftype="tabular"> <assert_contents> <has_size value="892" delta="50"/> <has_text text="MEE in isoform used less"/> </assert_contents> </output> <output name="splicingEnrichment" ftype="tabular"> <assert_contents> <has_size value="1157" delta="50"/> <has_text text="A5 gain 
(paired with A5 loss)"/> </assert_contents> </output> <output name="consequences_fulldata" ftype="tabular"> <assert_contents> <has_size value="103581" delta="50"/> <has_text text="signal_peptide_identified"/> </assert_contents> </output> <output name="splicing_fulldata" ftype="tabular"> <assert_contents> <has_size value="46951" delta="50"/> <has_text text="ATTS_genomic_start"/> </assert_contents> </output> <output name="isoformFeatures" ftype="tabular"> <assert_contents> <has_size value="99310" delta="50"/> <has_text text="gene_overall_mean"/> </assert_contents> </output> </test> <!-- Test 08: analyze single gene--> <test expect_num_outputs="2"> <conditional name="functionMode"> <param name="selector" value="second_step"/> <param name="robject" value="test04.RData"/> <conditional name="analysis_mode"> <param name="selector" value="single"/> <param name="gene" value="NADK"/> </conditional> </conditional> <output name="single_gene" ftype="pdf" file="test08_single_gene.pdf" compare="sim_size"/> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="531580" delta="300"/> </assert_contents> </output> </test> <!-- Test 09: Kallisto input--> <test expect_num_outputs="1"> <conditional name="functionMode"> <param name="selector" value="data_import"/> <param name="genomeAnnotation" value="annotation_kallisto.gtf.gz"/> <param name="transcriptome" value="transcriptome_kallisto.fasta.gz"/> <param name="countFiles" value="disabled"/> <conditional name="tool_source"> <param name="selector" value="kallisto"/> <section name="first_factor"> <param name="factorLevel" value="health"/> <param name="trans_counts" value="kallisto_cond1_rep1.tsv,kallisto_cond1_rep2.tsv"/> </section> <section name="second_factor"> <param name="factorLevel" value="cancer"/> <param name="trans_counts" value="kallisto_cond2_rep1.tsv,kallisto_cond2_rep2.tsv"/> </section> </conditional> </conditional> <output name="switchList" file="test09.RData" ftype="rdata" compare="sim_size" 
delta="100"/> </test> <!-- Test 10: Test paired samples in the experimental design--> <test expect_num_outputs="3"> <conditional name="functionMode"> <param name="selector" value="data_import"/> <param name="genomeAnnotation" value="annotation_salmon.gtf.gz"/> <param name="transcriptome" value="transcriptome.fasta.gz"/> <param name="pairedSamples" value="true"/> <param name="countFiles" value="matrix"/> <conditional name="tool_source"> <param name="selector" value="salmon"/> <section name="first_factor"> <param name="factorLevel" value="health"/> <param name="trans_counts" value="salmon_cond1_rep1.sf,salmon_cond1_rep2.sf"/> </section> <section name="second_factor"> <param name="factorLevel" value="cancer"/> <param name="trans_counts" value="salmon_cond2_rep1.sf,salmon_cond2_rep2.sf"/> </section> </conditional> </conditional> <output name="switchList" ftype="rdata"> <assert_contents> <has_size value="652170" delta="300"/> </assert_contents> </output> <output name="sample_annotation" file="test10_samples_annotation.tabular" ftype="tabular"/> </test> </tests> <help><![CDATA[

.. class:: infomark

**Purpose**

IsoformSwitchAnalyzeR is an easy-to-use R package that enables statistical identification of isoform switching from RNA-seq-derived quantification of novel and/or annotated full-length isoforms. IsoformSwitchAnalyzeR facilitates integration of many sources of (predicted) annotation, such as Open Reading Frame (ORF/CDS), protein domains (via Pfam), signal peptides (via SignalP), Intrinsically Disordered Regions (IDR, via NetSurfP-2 or IUPred2A), coding potential (via CPAT or CPC2), sensitivity to Non-sense Mediated Decay (NMD), and more. The combination of identified isoform switches and their annotation enables IsoformSwitchAnalyzeR to predict potential functional consequences of the identified isoform switches, such as loss of protein domains, thereby identifying isoform switches of particular interest.
Lastly, IsoformSwitchAnalyzeR provides article-ready visualization methods for isoform switches in individual genes, as well as summary statistics and visualization of the genome-wide changes caused by isoform switches, their consequences and the associated alternative splicing.

-----

.. class:: infomark

**Differential isoform expression (DIE) and differential isoform usage (DIU)**

Differential isoform expression (DIE) and differential isoform usage (DIU) are related but distinct concepts. DIE assesses the difference in absolute expression at the isoform level. In contrast, DIU assesses the difference in relative expression at the isoform level. For example, if the expression of two isoforms of one gene is 10 and 20 in control and 50 and 100 in case, then there is DIE but no DIU, because the relative expression of the first isoform is 1/3 in both case and control.

-----

.. class:: infomark

**ORF identification methods (novel isoform analysis)**

- **Longest**: Identifies the longest ORF in the transcript (after filtering via minORFlength). This approach is similar to what the CPAT tool uses in its analysis of coding potential.
- **LongestAnnotated**: Identifies the longest ORF (after filtering via minORFlength) downstream of an annotated translation start site (supplied via the cds argument).
- **Longest.AnnotatedWhenPossible**: A merge between "longestAnnotated" and "longest". For all isoforms that overlap CDS start positions from known isoforms, only these CDS starts are considered and the longest ORF is annotated (similar to "longestAnnotated"). Isoforms without any overlapping CDS start sites are analysed with the "longest" approach.
- **MostUpstream**: Identifies the most upstream ORF in the transcript (after filtering via minORFlength).
- **MostUpstreamAnnotated**: Identifies the ORF (after filtering via minORFlength) downstream of the most upstream overlapping annotated translation start site (supplied via the cds argument).
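
-----

.. class:: infomark

**Worked example: DIE versus DIU**

The DIE/DIU example given earlier in this help (two isoforms at 10 and 20 in control, 50 and 100 in case) can be checked with a few lines of Python. This is only an illustrative sketch, not how IsoformSwitchAnalyzeR computes its statistics: the helper name ``isoform_fractions`` and the variable names are hypothetical, and the label "dIF" is assumed here to mean the difference in isoform fraction between conditions.

```python
from fractions import Fraction


def isoform_fractions(expression):
    """Return each isoform's share of the total gene expression (hypothetical helper)."""
    total = sum(expression)
    return [Fraction(value, total) for value in expression]


# Expression of isoform 1 and isoform 2 of one gene, taken from the example in the text.
control = [10, 20]
case = [50, 100]

# DIE looks at absolute expression: both isoforms change between conditions.
expression_change = [c - k for k, c in zip(control, case)]

# DIU looks at relative expression: the isoform fractions stay at 1/3 and 2/3,
# so the difference in isoform fraction (assumed to correspond to "dIF") is zero.
dif = [c - k for k, c in zip(isoform_fractions(control), isoform_fractions(case))]

print("Absolute change per isoform:", expression_change)
print("Change in isoform fraction:", [float(d) for d in dif])
```

Both isoforms increase in absolute expression, while their fractions of the gene total are unchanged; this is the situation in which DIE is present but DIU is not.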
]]></help> <expand macro="citations" /> </tool>