view proteinortho_summary.xml @ 4:9b67a50799e9 draft
planemo upload for repository https://gitlab.com/paulklemm_PHD/proteinortho commit 84c463c8317d7c16c2b86b1d8657932cc0f39791
author | iuc |
---|---|
date | Tue, 22 Nov 2022 16:50:38 +0000 |
parents | c3f58c2eee1e |
children | de26b312c0d2 |
<tool id="proteinortho_summary" name="Proteinortho summary" version="@TOOL_VERSION@+galaxy@WRAPPER_VERSION@" profile="@PROFILE@">
    <description>summarizes the orthology-pairs/RBH files</description>
    <macros>
        <import>proteinortho_macros.xml</import>
    </macros>
    <expand macro="requirements"/>
    <expand macro="version_command"/>
    <command detect_errors="exit_code"><![CDATA[
        export TERM=dumb &&
        ## TODOs:
        ## - check if 2>&1 can be removed https://gitlab.com/paulklemm_PHD/proteinortho/-/merge_requests/9
        ## - include output0.tsv as Galaxy output?
        proteinortho_summary.pl
            '$queryfile'
            #if $queryfile2:
                '$queryfile2'
            #end if
            2>&1
        | awk '/^$/ && !f{f=1;next}1' ## remove potentially present 1st empty line
        | awk 'BEGIN{i=0} /^$/{i+=1}{print > ("output" i ".tsv")}' ## split file at empty lines
        && mv output1.tsv '$adjacencyMat'
        && mv output2.tsv '$average1paths'
        && mv output3.tsv '$adjacencyMatSquared'
        && mv output4.tsv '$average2paths'
    ]]></command>
    <inputs>
        <param name="queryfile" type="data" format="tabular" label="An orthology-pairs / RBH file"/>
        <param name="queryfile2" type="data" format="tabular" optional="true" label="(optional) A second orthology-pairs / RBH file" help="If you provide a second file, the difference is calculated (GRAPH - second GRAPH)"/>
    </inputs>
    <outputs>
        <data name="adjacencyMat" format="tabular" label="${tool.name} on ${on_string}: Adjacency Matrix"/>
        <data name="average1paths" format="tabular" label="${tool.name} on ${on_string}: Average number of Edges"/>
        <data name="adjacencyMatSquared" format="tabular" label="${tool.name} on ${on_string}: Matrix of 2-paths"/>
        <data name="average2paths" format="tabular" label="${tool.name} on ${on_string}: Average number of 2-paths"/>
    </outputs>
    <tests>
        <test expect_num_outputs="4">
            <param name="queryfile" value="result.proteinortho-graph"/>
            <output name="adjacencyMat">
                <assert_contents>
                    <has_text text="18"/>
                    <has_text text="14"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="average1paths">
                <assert_contents>
                    <has_text text="9.6"/>
                    <has_text text="15"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="adjacencyMatSquared">
                <assert_contents>
                    <has_text text="750"/>
                    <has_text text="74"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="average2paths">
                <assert_contents>
                    <has_text text="1088.8"/>
                    <has_text text="1374.2"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
        </test>
        <test expect_num_outputs="4">
            <param name="queryfile" value="result.proteinortho-graph"/>
            <param name="queryfile2" value="result.blast-graph"/>
            <output name="average2paths">
                <assert_contents>
                    <has_text text="49.6"/>
                    <has_text text="59.8"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
        </test>
    </tests>
    <help><![CDATA[proteinortho summary

**What it does**

proteinortho_summary : Summarizes the (orthology-pairs/RBH) file(s) to determine how well the species are connected to each other.

* **Adjacency Matrix** : How well the species are directly connected to each other.
* **Average number of Edges** : The average number of connections for each species.
* **Matrix of 2-paths** : The square of the adjacency matrix, i.e. the number of paths of length 2 between two species.
* **Average number of 2-paths** : The average number of 2-paths for each species. A species that is not well connected to all the other species gets a low average.

If you supply a second orthology-pairs/RBH file, the difference is calculated for all 4 outputs.

E.g. given the RBH and the orthology-pairs files of the same run: the outputs show how much the clustering removed from the initial reciprocal best hit graph.
Or given 2 orthology-pairs files from the same set of fasta files with different parameters (evalue, ...): the outputs show how the parameters change the connectivity of the output.

**Other Proteinortho-Tools for downstream analysis**

* `proteinortho grab proteins` : find proteins/genes in a given fasta file and retrieve their sequence(s). You can also use an orthology-groups file.

More information can be found on GitLab: https://gitlab.com/paulklemm_PHD/proteinortho
    ]]></help>
    <expand macro="citations"/>
</tool>
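
The help text describes the Matrix of 2-paths as the square of the adjacency matrix. A minimal Python sketch of that relation on a made-up three-species graph follows. The species, the edge counts, and the plain row-mean averaging are assumptions for illustration only and are not taken from proteinortho_summary.pl, which may average differently.

```python
def square(matrix):
    """Return matrix * matrix for a square list-of-lists matrix."""
    n = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def row_means(matrix):
    """Average each row (assumed averaging scheme, for illustration only)."""
    return [sum(row) / len(row) for row in matrix]


# Hypothetical edge counts between three species A, B, C:
# A-B share 2 edges, A-C share 1 edge, B-C share none.
adjacency = [
    [0, 2, 1],
    [2, 0, 0],
    [1, 0, 0],
]

# Entry (i, j) counts the paths of length 2 from species i to species j.
two_paths = square(adjacency)

for name, row in zip("ABC", two_paths):
    print(name, row)
print("average edges per species:", row_means(adjacency))
print("average 2-paths per species:", row_means(two_paths))
```

In this toy graph B and C share no direct edge, yet the squared matrix holds 2 for that pair, because each of the two A-B edges can be combined with the single A-C edge.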