view gemini_comp_hets.xml @ 2:ed0b0677dcfe draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/gemini commit 11ee7ac206d41894c0b6a11f2439aaea490824f0
author iuc
date Thu, 09 Nov 2017 13:17:10 -0500
parents d666cc4a37e7
children 62647e2637f6
line wrap: on
line source

<tool id="gemini_@BINARY@" name="GEMINI @BINARY@" version="@VERSION@.0">
    <description>Identifying potential compound heterozygotes</description>
    <macros>
        <import>gemini_macros.xml</import>
        <token name="@BINARY@">comp_hets</token>
    </macros>
    <expand macro="requirements" />
    <expand macro="stdio" />
    <expand macro="version_command" />
    <command>
<![CDATA[
        gemini @BINARY@

            @COLUMN_SELECT@

            @CMDLN_SQL_FILTER_FILTER_OPTION@

            #if int($min_kindreds) > 0:
                --min-kindreds $min_kindreds
            #end if

            #if str($families).strip():
                --families "$families"
            #end if

            -d $d
            $allow_unaffected

            #if int($min_genotypequality) > 0:
                --min-gq $min_genotypequality
            #end if

            #if int($gt_pl_max) != -1:
                --gt_pl_max $gt_pl_max
            #end if

            $pattern_only
            --max-priority $max_priority


            "${ infile }"
            > "${ outfile }"
]]>
    </command>
    <inputs>
        <expand macro="infile" />
        <expand macro="column_filter" />
        <expand macro="filter" />
        <expand macro="min_kindreds" />
        <expand macro="family" />
        <expand macro="unaffected" />
        <expand macro="min_sequence_depth" />
        <param name="min_genotypequality" type="integer" value="0" min="0" label="The minimum genotype quality required for each sample in a family." help="default: 0 (--min-gq)" />
        <param name="gt_pl_max" type="integer" value="-1" min="-1" label="The maximum phred-scaled genotype (PL) allowed for each sample in a family." help="default: -1 not set (--gt-pl-max)" />
        <param name="pattern_only" type="boolean" truevalue="--pattern-only" falsevalue="" checked="False"
            label="Find compound hets by inheritance pattern, without regard to affection" help="(--pattern-only)"/>
        <param name="max_priority" type="integer" value="1" min="1" label="Default is to show only confident compound hets. Set to 2 or higher to include pairs that are less likely true comp-hets." help="default: 1 (--max-priority)" />
    </inputs>
    <outputs>
        <data name="outfile" format="tabular" />
    </outputs>
    <tests>
        <test>
            <param name="infile" value="gemini_comphets_input.db" ftype="gemini.sqlite" />
            <param name="report_selector" value="column_filter" />
            <param name="columns" value="chrom,start,end,ref,alt,gene,impact" />
            <param name="allow_unaffected" value="True" />
            <param name="max_priority" value="3" />
            <output name="outfile" file="gemini_comphets_result.tabular" />
        </test>
    </tests>
    <help>
**What it does**

Many recessive disorders are caused by compound heterozygotes. Unlike canonical recessive sites where the same recessive allele is
inherited from both parents at the _same_ site in the gene, compound heterozygotes occur when the individual’s phenotype is caused
by two heterozygous recessive alleles at _different_ sites in a particular gene.

So basically, we are looking for two (typically loss-of-function (LoF)) heterozygous variants impacting the same gene at different loci.
The complicating factor is that this is _recessive_ and as such, we must also require that the consequential alleles at each heterozygous
site were inherited on different chromosomes (one from each parent). As such, in order to use this tool, we require that all variants are phased.
Once this has been done, the comp_hets tool will provide a report of candidate compound heterozygotes for each sample/gene.


    </help>
    <expand macro="citations"/>
</tool>