view funannotate_clean.xml @ 1:6298bb90475b draft

"planemo upload commit 87560553f1dbbd3e0ab7d7157fa5a7f32f61dca1"
author iuc
date Mon, 04 Oct 2021 19:36:02 +0000
parents b5ec3983deda
children 922ff4f431b3

<tool id="funannotate_clean" name="Funannotate assembly clean" profile="20.01" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@">
    <description></description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <requirements>
        <expand macro="requirements" />
    </requirements>
    <version_command>funannotate check --show-versions</version_command>
    <command><![CDATA[
funannotate clean
--input '${input}'
--out '${output}'
--pident ${pident}
--cov ${cov}
--minlen ${minlen}
${exhaustive}
    ]]></command>
    <inputs>
        <param argument="--input" type="data" format="fasta" label="Assembly to clean" />

        <param argument="--pident" type="integer" value="95" label="Percent identity of overlap" />
        <param argument="--cov" type="integer" value="95" label="Percent coverage of overlap" />
        <param argument="--minlen" type="integer" value="500" label="Minimum length of contig to keep" />

        <param argument="--exhaustive" type="boolean" checked="false" truevalue="--exhaustive" falsevalue="" label="Test every contig" help="By default, testing starts with the shortest contig and stops at the assembly N50" />
    </inputs>
    <outputs>
        <data name="output" format="fasta" label="${tool.name} on ${on_string}: cleaned assembly" />
    </outputs>
    <tests>
        <test>
            <param name="input" value="genome.fa" />
            <output name="output" file="cleaned.fa" compare="diff" />
        </test>
        <test>
            <param name="input" value="genome.fa" />
            <param name="pident" value="100" />
            <param name="cov" value="100" />
            <output name="output" file="cleaned_ident.fa" compare="diff" />
        </test>
    </tests>
    <help><![CDATA[
Funannotate_ clean
------------------

Funannotate_ is a pipeline for genome annotation (built specifically for fungi, but will also work with higher eukaryotes).

When working with haploid assemblies, you may want to remove repetitive contigs that are contained in other scaffolds of the assembly, while keeping any contigs that are in fact unique. Funannotate can help “clean” up these repetitive contigs. It uses a “leave one out” methodology based on minimap2 or mummer (nucmer): each of the shortest contigs/scaffolds is aligned to the rest of the assembly to determine whether it is repetitive. The script loops through the contigs, starting with the shortest and working its way up to the N50 of the assembly, and drops contigs/scaffolds whose overlap exceeds both the percent coverage (``--cov``) and the percent identity (``--pident``) thresholds.

.. _Funannotate: http://funannotate.readthedocs.io
    ]]></help>
    <expand macro="citations" />
</tool>