view vcffilter.xml @ 1:fd10ad2f8f22

Updated to vcflib 86723982aa
author Anton Nekrutenko <>
date Wed, 25 Jun 2014 16:14:19 -0400
parents f84ad3b2d320
children a1c14d64b003
line wrap: on
line source

<tool id="vcffilter" name="VCFfilter:" version="0.0.1">
    <requirement type="package" version="86723982aa">vcflib</requirement>
  <description>filter VCF data in a variety of attributes</description>
  <!-- This tools depends on tabix functionality, which is currently distributed with Galaxy itself via a pysam egg -->
    ln -s "${input1}" input1.vcf.gz &amp;&amp;
    ln -s "${Tabixized_input}" input1.vcf.gz.tbi &amp;&amp;
    vcffilter ${filterList} input1.vcf.gz  > "${out_file1}"
    <param name="filterList" size="40" type="text" value="-f &quot;DP &gt; 10&quot;" label="Specify filterting expression" help="See explanation of filtering options below">
        <valid initial="string.printable">
	  <remove value="&apos;"/>
        <mapping initial="none">
          <add source="&apos;" target="__sq__"/>
    <param format="vcf_bgzip" name="input1" type="data" label="From">
      <conversion name="Tabixized_input" type="tabix" />
    <data format="vcf" name="out_file1" />
      <param name="filterList" value="-f &quot;DP &gt; 10&quot;"/>
      <param name="input1" value="vcflib.vcf"/>
      <output name="out_file1" file="vcffilter-test1.vcf"/>

You can specify the following option the **Specify filtering expression** box in any combination::

    -f, --info-filter     specifies a filter to apply to the info fields of records, removes alleles which do not pass the filter
    -g, --genotype-filter specifies a filter to apply to the genotype fields of records
    -s, --filter-sites    filter entire records, not just alleles
    -t, --tag-pass        tag vcf records as positively filtered with this tag, print all records
    -F, --tag-fail        tag vcf records as negatively filtered with this tag, print all records
    -A, --append-filter   append the existing filter tag, don't just replace it
    -a, --allele-tag      apply -t on a per-allele basis.  adds or sets the corresponding INFO field tag
    -v, --invert          inverts the filter, e.g. grep -v
    -o, --or              use logical OR instead of AND to combine filters
    -r, --region          specify a region on which to target the filtering (must be used in conjunction with -f or -g

Filters are specified in the form {ID}  {operator} {value}::

 -f "DP > 10"          # for info fields
 -g "GT = 1|1"         # for genotype fields
 -f "CpG"              # for 'flag' fields

Any number of filters may be specified.  They are combined via logical AND unless --or is specified on the command line. For convenience, you can specify "QUAL" to refer to the quality of the site, even though it does not appear in the INFO fields.  

Operators can be any of: =, !, &lt;, &gt;, pipe, &amp;

To restrict output to a specific location use -r option (much be used in conjunction with -g or -f)::

 -r chr20:14000-15000  # only output calls between positions 14,000 and 15,000 on chromosome 20
 -r chrX               # only output call on chromosome X


Vcffilter is a part of VCFlib toolkit developed by Erik Garrison (