view gemini_gene_wise.xml @ 8:e57a1b0ac6be draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/gemini commit f7bdf08922aaf4119aefe7041e754a69cf64aebd
author | iuc |
---|---|
date | Wed, 13 Jul 2022 15:30:49 +0000 |
parents | 4b26f6c99227 |
children |
<tool id="gemini_@BINARY@" name="GEMINI @BINARY@" version="@VERSION@+galaxy1">
    <description>Discover per-gene variant patterns across families</description>
    <expand macro="bio_tools"/>
    <macros>
        <import>gemini_macros.xml</import>
        <token name="@BINARY@">gene_wise</token>
    </macros>
    <expand macro="requirements" />
    <expand macro="stdio" />
    <expand macro="version_command" />
    <command>
<![CDATA[
        gemini @BINARY@

            #if int($min_filters) > 0:
                --min-filters $min_filters
            #end if

            #for $filter in $filter_by_genotype:
                #set $multiline_sql_expr = str($filter.gt_filter)
                #if $filter.is_required:
                    #set $cmdln_param = "--gt-filter-required"
                #else:
                    #set $cmdln_param = "--gt-filter"
                #end if
                @MULTILN_SQL_EXPR_TO_CMDLN@
            #end for

            #set $report = $oformat.report
            @COLUMN_SELECT@

            #set $where_clause_elements = []
            #set $filter_cmdln_param = '--filter'
            #for $cond in $constraint:
                #if str($cond.filter).strip():
                    #silent $where_clause_elements.append(str($cond.filter).strip())
                    #if $cond.overwrite_default_filter:
                        #set $filter_cmdln_param = '--where'
                    #end if
                #end if
            #end for
            @PARSE_REGION_ELEMENTS@
            #if $region_elements:
                #silent $where_clause_elements.append(" OR ".join($region_elements))
            #end if
            #set $filter = " AND ".join($where_clause_elements)
            #if str($filter):
                $filter_cmdln_param '$filter'
            #end if

            '$infile'
            > '$outfile'
]]>
    </command>
    <inputs>
        <expand macro="infile" />
        <expand macro="gt_filter" default_repeat="1" min_repeat="1" max_repeat="999">
            <param name="is_required" type="boolean" checked="False" label="Make this an obligate filter that a variant has to pass to be considered" help="By default, a variant has to pass a minimum number of genotype filters (set below) to get reported. By making a filter required, you ensure that variants that fail this one filter are always excluded.
Required filters that a variant passes do not count towards its number of passed (regular) filters" />
        </expand>
        <param name="min_filters" type="integer" value="1" min="1" label="Minimum number of filters" help="(--min-filters)" />
        <expand macro="region_filter" />
        <expand macro="insert_constraint">
            <expand macro="overwritable_where_default" default_where="exonic, high impact variants (SQL clause: is_exonic = 1 and impact_severity != 'LOW')" />
        </expand>
        <section name="oformat" title="Output - included information" expanded="true">
            <expand macro="column_filter" />
        </section>
    </inputs>
    <outputs>
        <data name="outfile" format="tabular" />
    </outputs>
    <tests>
        <test>
            <param name="infile" value="gemini_amend_input.db" ftype="gemini.sqlite" />
            <repeat name="filter_by_genotype">
                <param name="gt_filter" value="((gt_depths).(*).(>=1).(all))" />
            </repeat>
            <output name="outfile">
                <assert_contents>
                    <has_line_matching expression="variant_id&#009;gene.*" />
                </assert_contents>
            </output>
        </test>
    </tests>
    <help>
<![CDATA[
**What it does**

This tool extends the *GEMINI inheritance pattern* tool in that it lets you
search for custom gene-wise inheritance patterns of variants, instead of fixed
ones.

See also: the `command line tool documentation <https://gemini.readthedocs.io/en/latest/content/tools.html#gene-wise-custom-genotype-filtering-by-gene>`__

-----

*Genotype filters*

The syntax for specifying a genotype filter (``--gt-filter`` command line
option) is the same as for the *GEMINI query* tool and is described
`here <https://gemini.readthedocs.io/en/latest/content/querying.html#gt-filter-filtering-on-genotypes>`__.

The difference with the *gene_wise* tool is that it lets you specify multiple
such filters and, if you do, every filter can be met by a **different variant**
as long as all of them are in the **same gene**.
This is useful if your analysis includes several families that you suspect
(based on a shared phenotype) to have the same gene affected, but not
necessarily through the same variant. In this case, you can formulate one
filter per family like, for example::

    gt_types.fam1_kid == HET and gt_types.fam1_mom == HOM_REF and gt_types.fam1_dad == HOM_REF
    gt_types.fam2_kid == HET
    gt_types.fam3_kid == HET

Together, these filters would allow you to find a causal gene that's affected
by different (dominant) variants in children from three different families.
Note that the first filter combines three conditions applied to family 1,
which, thus, must be met by the same variant site.

*Regular and required filters* (``--gt-filter`` *vs* ``--gt-filter-required``) and the *Minimum number of filters*

For every genotype filter you define, you can specify whether it should be
applied as a regular or as a required filter. The difference is that a variant
that fails a required filter is excluded from further analysis. Of the regular
filters, a gene and its variants only have to pass a threshold number defined
by *Minimum number of filters* (``--min-filters``).

Suppose that, with the above filters, you had set ``--min-filters`` to ``2``.
Then a gene for which the child in family 2 carries one copy of a variant
allele and the child in family 3 carries a copy of a different allele would be
reported, regardless of whether any other allele in that gene passes the first
filter, *etc.*

-----

*Region filters*

Region filters let you restrict your analysis to parts of the genome, which
can be useful if you have prior knowledge of the approximate location of the
causative gene. If you specify more than one region filter, they get combined
with a logical *OR*, meaning variants and genes falling in *any* of the regions
are reported.

-----

*Additional constraints on variants*

These get translated directly into the WHERE clause of an SQL query and, thus,
have to be expressed in valid SQL syntax.
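
As a purely illustrative sketch, a constraint that keeps only exonic variants
of high impact severity might be entered as shown below. The columns
``is_exonic`` and ``impact_severity`` are the ones used in this tool's default
filter; the value ``'HIGH'`` is an assumption about the severity levels
annotated in your database, so adjust it to your data::

    is_exonic = 1 and impact_severity = 'HIGH'
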
Of particular interest, here, is the fact that, by default, the *gene_wise*
tool applies the WHERE clause: ``is_exonic = 1 and impact_severity != 'LOW'``,
which means the tool only considers variants in exons that are not of *LOW*
impact severity (*i.e.*, not silent mutations). While this can be a good and
biologically justifiable setting, you can overwrite it if needed.

Note that in SQL syntax tests for equality use a single ``=``, while genotype
filters (discussed above) follow Python syntax and use ``==`` for the same
purpose. Also note that non-numerical values need to be enclosed in single
quotes, *e.g.* ``'LOW'``, but numerical values must *NOT* be.
]]>
    </help>
    <expand macro="citations"/>
</tool>