Mercurial > repos > iuc > mummer_show_coords

<tool id="mummer_show_coords" name="Show-Coords" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">
    <description>Parse delta file and report coordinates and other information</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="bio_tools"/>
    <expand macro="requirements" />
    <command detect_errors="exit_code">
        <![CDATA[
        show-coords
            $merge
            $direction
            -c
            -H
            -I '$identity'
            -l
            -L '$min_alignment_length'
            $annotate
            $sort
            -T
            '${delta}'
            #if $direction:
                >  '${output_extend}'
            #else:
               >  '${output}'
            #end if
        ]]>
    </command>
    <inputs>
        <param name="delta" type="data" format="tabular" label="Match file from Nucmer" />
        <param name="merge" type="boolean" argument="-b" truevalue="-b" falsevalue="" label="Merge"
            help="Merges overlapping alignments regardless of match dir or frame and does not display any idenitity information. (-b)" />
        <param name="identity" type="float" argument="-I" value="75.0" label="Identity" help="Set minimum percent identity to display (-I)" />

        <param name="direction" type="boolean" argument="-d"  truevalue="-d" falsevalue="" label="Direction"  help="Display the alignment direction in the additional FRM columns (-d)" />
        <param name="min_alignment_length" type="integer" argument="-L" value="100" label="Minimum Alignment Length" help="Set minimum alignment length to display (-L)" />
        <param name="annotate" type="boolean" argument="-o" truevalue="-o" falsevalue="" label="Annotate"
            help="Annotate maximal alignments between two sequences, i.e. overlaps between reference and query sequences (-o)" />
        <param name="sort" type="select" label="Sorting strategy for output" >
            <option value="-q">Sort output by query IDs and coordinates (-q)</option>
            <option value="-r">Sort output by reference IDs and coordinates (-r)</option>
        </param>

    </inputs>
    <outputs>
        <data name="output" format="tabular" from_work_dir="show-coords.txt"   label="${tool.name} on ${on_string} default format)">
            <filter>direction is False</filter>
            <actions>
                <action name="column_names" type="metadata" default="[S1], [E1], [S2], [E2], [LEN 1], [LEN 2], [% IDY], [LEN R], [LEN Q], [COV R], [COV Q], [REF TAG], [QUERY TAG]" />
            </actions>
        </data>
        <data name="output_extend" format="tabular" from_work_dir="show-coords_extend.txt"   label="${tool.name} on ${on_string} extend format)" >
            <filter>direction is True</filter>
            <actions>
                <action name="column_names" type="metadata" default="[S1], [E1], [S2], [E2], [LEN 1], [LEN 2], [% IDY], [LEN R], [LEN Q], [COV R], [COV Q], [FRM R], [FRM Q], [REF TAG], [QUERY TAG]" />
            </actions>
        </data>
    </outputs>
    <tests>
        <test expect_num_outputs="1">
            <param name="delta" ftype="tabular" value="nucmer.txt" />
            <param name="direction" value="False" />
            <output name="output" ftype="tabular" compare="diff" value="show-coords.txt" />
        </test>
        <test expect_num_outputs="1">
            <param name="delta" ftype="tabular" value="nucmer.txt" />
            <param name="direction" value="True" />
            <output name="output_extend" ftype="tabular" compare="diff" value="show-coords_extend.txt" />
        </test>
    </tests>
    <help><![CDATA[
This program parses the delta alignment output of nucmer and displays the coordinates, and other useful information about the alignments.

Output is tabular. Below is a description of each column (Default):
    * **[S1]** Start of the alignment region in the reference sequence
    * **[E1]** End of the alignment region in the reference sequence
    * **[S2]** Start of the alignment region in the query sequence
    * **[E2]** End of the alignment region in the query sequence
    * **[LEN 1]** Length of the alignment region in the reference sequence, measured in nucleotides
    * **[LEN 2]** Length of the alignment region in the query sequence, measured in nucleotides
    * **[% IDY]** Percent identity of the alignment, calculated as (number of exact matches) / ([LEN 1] + insertions in the query)
    * **[LEN R]** Length of the reference sequence
    * **[LEN Q]** Length of the query sequence
    * **[COV R]** Percent coverage of the alignment on the reference sequence, calculated as [LEN 1] / [LEN R]
    * **[COV Q]** Percent coverage of the alignment on the query sequence, calculated as [LEN 2] / [LEN Q]
    * **[FRM R]** Reading frame for the reference sequence (only with -d)
    * **[FRM Q]** Reading frame for the query sequence (only with -d)
    * **[REF TAG]** The reference FastA ID
    * **[QUERY TAG]** The query FastA ID

There is also an optional final column (turned on with the "Annotate" parameter) that will contain some 'annotations'. The Annotate option will annotate alignments that represent overlaps between two sequences. Sometimes, nucmer will extend adjacent clusters past one another, thus causing a somewhat redundant output, this option will notify users of such rare occurrences.

The Percent Coverage and Sequence Length options are useful when comparing two sets of assembly contigs, in that these options help determine if an alignment spans an entire contig, or is just a partial hit to a different read. The Merge option is useful when the user wishes to identify sytenic regions between two genomes, but is not particularly interested in the actual alignment similarity or appearance. This option also disregards match orientation, so should not be used if this information is needed.

**Options:**::

    -b      Merges overlapping alignments regardless of match dir or frame and does not display any
            identity information.
    -d      Display the alignment direction in the additional FRM columns
    -I      Set minimum percent identity to display

    -L      Set minimum alignment length to display

    -o      Annotate maximal alignments between two sequences, i.e. overlaps between reference and query
            sequences

    -q      Sort output lines by query IDs and coordinates

    -r      Sort output lines by reference IDs and coordinates

    ]]></help>
    <expand macro="citation" />
</tool>
author	iuc
date	Tue, 04 Feb 2025 09:18:17 +0000
parents	b4124df98b26
children