Mercurial > repos > ufz > phabox_phavip

<tool id="phabox_phavip" name="PhaBOX phavip" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="23.1" license="MIT">
    <description>Protein annotation</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <xrefs>
        <xref type="bio.tools">phabox</xref>
    </xrefs>
    <requirements>
        <requirement type="package" version="@TOOL_VERSION@">phabox</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        phabox2 --task phavip
            @GENERAL@
    ]]></command>
    <inputs>
        <expand macro="general"/>
        <param name="supp_outputs" type="select" optional="true" label="Supplementary outputs">
            <option value="gene_annotation">Gene annotation</option>
        </param>
    </inputs>
    <outputs>
        <data name="out" format="tabular" from_work_dir="output/final_prediction/phavip_prediction.tsv"/>
        <data name="phavip_annotation" format="tabular" from_work_dir="output/final_prediction/phavip_supplementary/gene_annotation.tsv" label="${tool.name} on ${on_string}: Gene annotation">
            <filter>supp_outputs and "gene_annotation" in supp_outputs</filter>
        </data>
    </outputs>
    <tests>
        <test expect_num_outputs="1">
            <param name="dbdir" value="phaboxdb"/>
            <param name="contigs" value="example_contigs.fa"/>
            <output name="out">
                <assert_contents>
                    <has_line line="Accession&#9;Length&#9;Protein_num&#9;Annotated_num&#9;Annotation_rate"/>
                    <has_n_lines n="11"/>
                    <has_n_columns n="5"/>
                </assert_contents>
            </output>
        </test>
        <test expect_num_outputs="2">
            <param name="dbdir" value="phaboxdb"/>
            <param name="contigs" value="example_contigs.fa"/>
            <param name="supp_outputs" value="gene_annotation"/>
            <output name="out">
                <assert_contents>
                    <has_line line="Accession&#9;Length&#9;Protein_num&#9;Annotated_num&#9;Annotation_rate"/>
                    <has_n_lines n="11"/>
                    <has_n_columns n="5"/>
                </assert_contents>
            </output>
            <output name="phavip_annotation">
                <assert_contents>
                    <has_line line="Genome&#9;ORF&#9;Start&#9;End&#9;Strand&#9;GC&#9;Annotation&#9;pident&#9;coverage"/>
                    <has_n_lines n="171"/>
                    <has_n_columns n="9"/>
                </assert_contents>
            </output>
        </test>
    </tests>
    <help><![CDATA[

Please note that running task end_to_end, phamer, phagcn, phatyp, and cherry, will automatically run phavip.
The output files are the same but the supplementary files will be dumped into the corresponding task.


@COMMON_INPUT_DOC@

**Output**

@COMMON_OUTPUT_DOC@
- Protein_num: total number of predicted proteins.
- Annotated_num: number of proteins that have significant alignments.
- Annotation_rate: percentage of proteins that have annotations.

In addition the gene annotation itself can be produced:

- Genome: the accession or the name of the input contigs.
- ORF: the ID of the translated protein.
- Start: start position on the genome.
- End: end position on the genome.
- Strand: forward (1) or backward(-1).
- GC: GC content.
- Annotation: the annotation of the proteins.

Please note that there are two kinds of hypothetical protein:

- hypothetical protein (no hit): a protein has no alignment results to the reference database.
- hypothetical protein (no hit): a protein has alignment results but the annotation is "hypothetical protein"


    ]]></help>
    <expand macro="citations"/>
</tool>