view proteinortho_summary.xml @ 6:2bb8251d132c draft
planemo upload for repository https://gitlab.com/paulklemm_PHD/proteinortho commit ebbb4da7c0c0176df539f9baa7c9323d9ff5f201
author | iuc |
---|---|
date | Tue, 31 Oct 2023 16:31:54 +0000 |
parents | de26b312c0d2 |
<tool id="proteinortho_summary" name="Proteinortho summary" version="@TOOL_VERSION@+galaxy@WRAPPER_VERSION@" profile="@PROFILE@">
    <description>summarizes the orthology-pairs/RBH files</description>
    <macros>
        <import>proteinortho_macros.xml</import>
    </macros>
    <expand macro="biotools"/>
    <expand macro="requirements"/>
    <expand macro="version_command"/>
    <command detect_errors="exit_code"><![CDATA[
        export TERM=dumb &&
        proteinortho_summary.pl
            '$queryfile'
            #if $queryfile2:
                '$queryfile2'
            #end if
            2>&1
        | awk '/^$/ && !f{f=1;next}1' ## remove potentially present 1st empty line
        | awk 'BEGIN{i=0} /^$/{i+=1}{print > ("output" i ".tsv")}' ## split file at empty lines
        && mv output0.tsv '$distribution'
        && mv output1.tsv '$adjacencyMat'
        && mv output2.tsv '$average1paths'
        && mv output3.tsv '$adjacencyMatSquared'
        && mv output4.tsv '$average2paths'
    ]]></command>
    <inputs>
        <param name="queryfile" type="data" format="tabular" label="An orthology-pairs / RBH file"/>
        <param name="queryfile2" type="data" format="tabular" optional="true" label="(optional) A second orthology-pairs / RBH file" help="If you provide a second file, then the difference is calculated (GRAPH - second GRAPH)"/>
    </inputs>
    <outputs>
        <data name="distribution" format="tabular" label="${tool.name} on ${on_string}: Protein-Group distribution"/>
        <data name="adjacencyMat" format="tabular" label="${tool.name} on ${on_string}: Adjacency Matrix"/>
        <data name="average1paths" format="tabular" label="${tool.name} on ${on_string}: Average number of Edges"/>
        <data name="adjacencyMatSquared" format="tabular" label="${tool.name} on ${on_string}: Matrix of 2-paths"/>
        <data name="average2paths" format="tabular" label="${tool.name} on ${on_string}: Average number of 2-paths"/>
    </outputs>
    <tests>
        <test expect_num_outputs="5">
            <param name="queryfile" value="result.proteinortho-graph"/>
            <output name="distribution">
                <assert_contents>
                    <has_text text="%"/>
                </assert_contents>
            </output>
            <output name="adjacencyMat">
                <assert_contents>
                    <has_text text="18"/>
                    <has_text text="14"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="average1paths">
                <assert_contents>
                    <has_text text="9.6"/>
                    <has_text text="15"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="adjacencyMatSquared">
                <assert_contents>
                    <has_text text="750"/>
                    <has_text text="74"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
            <output name="average2paths">
                <assert_contents>
                    <has_text text="1088.8"/>
                    <has_text text="1374.2"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
        </test>
        <test expect_num_outputs="5">
            <param name="queryfile" value="result.proteinortho-graph"/>
            <param name="queryfile2" value="result.blast-graph"/>
            <output name="average2paths">
                <assert_contents>
                    <has_text text="49.6"/>
                    <has_text text="59.8"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
        </test>
        <test expect_num_outputs="5">
            <param name="queryfile" value="result.blast-graph"/>
            <output name="average2paths">
                <assert_contents>
                    <has_text text="115.2"/>
                    <has_text text="TERM" negate="true"/>
                </assert_contents>
            </output>
        </test>
    </tests>
    <help><![CDATA[
proteinortho summary

**What it does**

proteinortho_summary : Summarizes the (orthology-pairs/RBH) file(s) to determine how well the species are connected to each other.

* **Protein-Group distribution** (for orthology-pairs) : This report contains overall statistics about the output. (i) Number of groups that contain at least p% of the input species (with p ranging between 0 and 100). (ii) Number of groups for each input species.
* **Adjacency Matrix** : How well the species are directly connected to each other.
* **Average number of Edges** : Average number of connections for each species.
* **Matrix of 2-paths** : The square of the adjacency matrix, i.e. the number of paths of length 2 between two species.
* **Average number of 2-paths** : The average number of 2-paths for each species.
  If a species is not well connected to all the other species, it will result in a low average.

If you supply a second orthology-pairs/RBH file, then the difference is calculated for all 4 outputs. For example, given the RBH file and the orthology-pairs file of the same run, the outputs show how much the clustering removed from the initial reciprocal best hit graph. Given two orthology-pairs files from the same set of fasta files but with different parameters (evalue, ...), the outputs show how the parameters change the connectivity of the output.

**Other Proteinortho-Tools for downstream analysis**

* `proteinortho grab proteins` : find proteins/genes in a given fasta file and retrieve their sequence(s). You can also use an orthology-groups file.

More information can be found on GitLab: https://gitlab.com/paulklemm_PHD/proteinortho
    ]]></help>
    <expand macro="citations"/>
</tool>
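<!--
The two awk filters in the command block can be hard to read at a glance. The
Python sketch below mimics them on an in-memory list of lines. It assumes that
proteinortho_summary.pl separates its report sections with empty lines, as the
wrapper's own comments suggest. The function names and the sample lines are
hypothetical and are not part of Proteinortho or of this wrapper.

```python
def drop_first_empty_line(lines):
    """Mimic awk '/^$/ && !f{f=1;next}1': skip only the first empty line."""
    seen_empty = False
    kept = []
    for line in lines:
        if line == "" and not seen_empty:
            seen_empty = True  # later empty lines are kept as separators
            continue
        kept.append(line)
    return kept


def split_at_empty_lines(lines):
    """Mimic awk 'BEGIN{i=0} /^$/{i+=1}{print > ("output" i ".tsv")}'.

    Every empty line bumps the file index, and the empty line itself is
    written to the new file, just like in the awk one-liner.
    """
    files = {}
    index = 0
    for line in lines:
        if line == "":
            index += 1
        files.setdefault(f"output{index}.tsv", []).append(line)
    return files


if __name__ == "__main__":
    # Hypothetical report: a leading empty line, then three sections.
    report = ["", "section A", "", "section B", "", "section C"]
    for name, content in split_at_empty_lines(drop_first_empty_line(report)).items():
        print(name, content)
```

As in the awk pipeline, every file after the first starts with the empty
separator line. Note that the first filter skips the first empty line wherever
it occurs, so it only acts as "remove a leading blank line" when the report
really starts with one.
-->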