Mercurial > repos > iuc > gff3_rebase
diff gff3_rebase.xml @ 0:e54940ea270c draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/gff3_rebase commit 350ab347625ed5941873ba0deb3e1cf219d60052
author | iuc |
---|---|
date | Thu, 20 Apr 2017 08:12:49 -0400 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/gff3_rebase.xml Thu Apr 20 08:12:49 2017 -0400 @@ -0,0 +1,93 @@ +<tool id="gff3.rebase" name="Rebase GFF3 features" version="1.2"> + <description>against parent features</description> + <macros> + <import>macros.xml</import> + </macros> + <expand macro="requirements"/> + <version_command>python gff3_rebase.py --version</version_command> + <command detect_errors="aggressive"><![CDATA[ + python '$__tool_directory__/gff3_rebase.py' + '$parent' + '$child' + + $interpro + $protein2dna + > '$output']]> + </command> + <inputs> + <param label="Parent GFF3 annotations" name="parent" format="gff3" type="data"/> + <param label="Child GFF3 annotations to rebase against parent" name="child" format="gff3" type="data"/> + + <param label="Interpro specific modifications" name="interpro" type="boolean" truevalue="--interpro" falsevalue=""/> + <param label="Map protein translated results to original DNA data" name="protein2dna" type="boolean" truevalue="--protein2dna" falsevalue=""/> + </inputs> + <outputs> + <data format="gff3" name="output"/> + </outputs> + <tests> + <test> + <param name="parent" value="parent.gff"/> + <param name="child" value="child.gff"/> + <param name="protein2dna" value="True" /> + <output name="output" file="proteins.gff"/> + </test> + </tests> + <help><![CDATA[ +**What it does** + +Often the genomic data processing/analysis process requires a workflow like the following: + +- select some features from a genome +- export the sequences associated with those regions +- analyse those exports with some tool like Blast + +For display, especially in software like JBrowse, it is convenient to know +where in the original genome the analysis results would fall. E.g. if a +transmembrane domain is detected at bases 10-20 of an analysed protein, where +should this be displayed relative to the parent genome? + +This tool helps fill that gap, by rebasing some analysis results against the +parent features which were originally analysed. + +**Example Inputs** + +For a "child" set of annotations like:: + + #gff-version 3 + cds42 blastp match_part 1 50 1e-40 . . ID=m00001;Notes=RNAse A Protein + +And a parent set of annotations like:: + + #gff-version 3 + PhageBob maker cds 300 600 . + . ID=cds42 + +One could imagine that during the analysis process, the user had exported the parent annotation into some sequence:: + + >cds42 + M...... + +and then analysed those results, producing the "child" annotation file. This +tool will then localize the results properly against the parent:: + + #gff-version 3 + PhageBob blastp match_part 300 449 1e-40 + . ID=m00001;Notes=RNAse A Protein + +which will allow you to display the results in the correct location in visualizations. + +**Options** + +There are two optional flags which can be passed. + +The Interpro flag selectively ignores features which shouldn't be included in +the output (i.e. ``polypeptide``), and a couple of qualifiers that aren't +useful (``status``, ``Target``) + +The "Map Protein..." flag says that you translated the sequences during the +genomic export process, analysing protein sequences. This indicates to the +software that the bases should be multiplied by three to obtain the correct DNA +locations. +]]></help> + <citations> + <citation type="doi">10.1093/bioinformatics/btp163</citation> + </citations> +</tool>