What it does
Often the genomic data processing/analysis process requires a workflow like the following:
For display, especially in software like JBrowse, it is convenient to know where in the original genome the analysis results would fall. E.g. if a transmembrane domain is detected at bases 10-20 of an analysed protein, where should this be displayed relative to the parent genome?
This tool helps fill that gap, by rebasing some analysis results against the parent features which were originally analysed.
For a "child" set of annotations like:
#gff-version 3 cds42 blastp match_part 1 50 1e-40 . . ID=m00001;Notes=RNAse A Protein
And a parent set of annotations like:
#gff-version 3 PhageBob maker cds 300 600 . + . ID=cds42
One could imagine that during the analysis process, the user had exported the parent annotation into some sequence:
and then analysed those results, producing the "child" annotation file. This tool will then localize the results properly against the parent:
#gff-version 3 PhageBob blastp match_part 300 449 1e-40 + . ID=m00001;Notes=RNAse A Protein
which will allow you to display the results in the correct location in visualizations.
There are two optional flags which can be passed.
The Interpro flag selectively ignores features which shouldn't be included in the output (i.e. polypeptide), and a couple of qualifiers that aren't useful (status, Target)
The "Map Protein..." flag says that you translated the sequences during the genomic export process, analysing protein sequences. This indicates to the software that the bases should be multiplied by three to obtain the correct DNA locations.