Previous changeset 20:7ac17b6d031e (2015-12-18) Next changeset 22:96f393ad7fc6 (2015-12-23) |
Commit message:
Uploaded |
removed:
LICENSE phe/__init__.py phe/metadata/__init__.py phe/variant/GATKVariantCaller.py phe/variant/MPileupVariantCaller.py phe/variant/__init__.py phe/variant/variant_factory.py phe/variant_filters/__init__.py test-data/1_short.vcf test-data/2_short.vcf test-data/testresult.fa tool_dependencies.xml vcfs2fasta.py vcfs2fasta.sh vcfs2fasta.xml |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 LICENSE --- a/LICENSE Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
b'@@ -1,676 +0,0 @@\n-\n-\n- GNU GENERAL PUBLIC LICENSE\n- Version 3, 29 June 2007\n-\n- Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>\n- Everyone is permitted to copy and distribute verbatim copies\n- of this license document, but changing it is not allowed.\n-\n- Preamble\n-\n- The GNU General Public License is a free, copyleft license for\n-software and other kinds of works.\n-\n- The licenses for most software and other practical works are designed\n-to take away your freedom to share and change the works. By contrast,\n-the GNU General Public License is intended to guarantee your freedom to\n-share and change all versions of a program--to make sure it remains free\n-software for all its users. We, the Free Software Foundation, use the\n-GNU General Public License for most of our software; it applies also to\n-any other work released this way by its authors. You can apply it to\n-your programs, too.\n-\n- When we speak of free software, we are referring to freedom, not\n-price. Our General Public Licenses are designed to make sure that you\n-have the freedom to distribute copies of free software (and charge for\n-them if you wish), that you receive source code or can get it if you\n-want it, that you can change the software or use pieces of it in new\n-free programs, and that you know you can do these things.\n-\n- To protect your rights, we need to prevent others from denying you\n-these rights or asking you to surrender the rights. Therefore, you have\n-certain responsibilities if you distribute copies of the software, or if\n-you modify it: responsibilities to respect the freedom of others.\n-\n- For example, if you distribute copies of such a program, whether\n-gratis or for a fee, you must pass on to the recipients the same\n-freedoms that you received. You must make sure that they, too, receive\n-or can get the source code. And you must show them these terms so they\n-know their rights.\n-\n- Developers that use the GNU GPL protect your rights with two steps:\n-(1) assert copyright on the software, and (2) offer you this License\n-giving you legal permission to copy, distribute and/or modify it.\n-\n- For the developers\' and authors\' protection, the GPL clearly explains\n-that there is no warranty for this free software. For both users\' and\n-authors\' sake, the GPL requires that modified versions be marked as\n-changed, so that their problems will not be attributed erroneously to\n-authors of previous versions.\n-\n- Some devices are designed to deny users access to install or run\n-modified versions of the software inside them, although the manufacturer\n-can do so. This is fundamentally incompatible with the aim of\n-protecting users\' freedom to change the software. The systematic\n-pattern of such abuse occurs in the area of products for individuals to\n-use, which is precisely where it is most unacceptable. Therefore, we\n-have designed this version of the GPL to prohibit the practice for those\n-products. If such problems arise substantially in other domains, we\n-stand ready to extend this provision to those domains in future versions\n-of the GPL, as needed to protect the freedom of users.\n-\n- Finally, every program is threatened constantly by software patents.\n-States should not allow patents to restrict development and use of\n-software on general-purpose computers, but in those that do, we wish to\n-avoid the special danger that patents applied to a free program could\n-make it effectively proprietary. To prevent this, the GPL assures that\n-patents cannot be used to render the program non-free.\n-\n- The precise terms and conditions for copying, distribution and\n-modification follow.\n-\n- TERMS AND CONDITIONS\n-\n- 0. Definitions.\n-\n- "This License" refers to version 3 of the GNU General Public License.\n-\n- "Copyright" also means copyright-like laws that apply to other kinds of\n-works, such as semiconductor masks.\n-\n- "The Program" refers '..b'CE OF THE PROGRAM\n-IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF\n-ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n-\n- 16. Limitation of Liability.\n-\n- IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING\n-WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS\n-THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY\n-GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE\n-USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF\n-DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\n-PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),\n-EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF\n-SUCH DAMAGES.\n-\n- 17. Interpretation of Sections 15 and 16.\n-\n- If the disclaimer of warranty and limitation of liability provided\n-above cannot be given local legal effect according to their terms,\n-reviewing courts shall apply local law that most closely approximates\n-an absolute waiver of all civil liability in connection with the\n-Program, unless a warranty or assumption of liability accompanies a\n-copy of the Program in return for a fee.\n-\n- END OF TERMS AND CONDITIONS\n-\n- How to Apply These Terms to Your New Programs\n-\n- If you develop a new program, and you want it to be of the greatest\n-possible use to the public, the best way to achieve this is to make it\n-free software which everyone can redistribute and change under these terms.\n-\n- To do so, attach the following notices to the program. It is safest\n-to attach them to the start of each source file to most effectively\n-state the exclusion of warranty; and each file should have at least\n-the "copyright" line and a pointer to where the full notice is found.\n-\n- {one line to give the program\'s name and a brief idea of what it does.}\n- Copyright (C) {year} {name of author}\n-\n- This program is free software: you can redistribute it and/or modify\n- it under the terms of the GNU General Public License as published by\n- the Free Software Foundation, either version 3 of the License, or\n- (at your option) any later version.\n-\n- This program is distributed in the hope that it will be useful,\n- but WITHOUT ANY WARRANTY; without even the implied warranty of\n- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n- GNU General Public License for more details.\n-\n- You should have received a copy of the GNU General Public License\n- along with this program. If not, see <http://www.gnu.org/licenses/>.\n-\n-Also add information on how to contact you by electronic and paper mail.\n-\n- If the program does terminal interaction, make it output a short\n-notice like this when it starts in an interactive mode:\n-\n- {project} Copyright (C) {year} {fullname}\n- This program comes with ABSOLUTELY NO WARRANTY; for details type `show w\'.\n- This is free software, and you are welcome to redistribute it\n- under certain conditions; type `show c\' for details.\n-\n-The hypothetical commands `show w\' and `show c\' should show the appropriate\n-parts of the General Public License. Of course, your program\'s commands\n-might be different; for a GUI interface, you would use an "about box".\n-\n- You should also get your employer (if you work as a programmer) or school,\n-if any, to sign a "copyright disclaimer" for the program, if necessary.\n-For more information on this, and how to apply and follow the GNU GPL, see\n-<http://www.gnu.org/licenses/>.\n-\n- The GNU General Public License does not permit incorporating your program\n-into proprietary programs. If your program is a subroutine library, you\n-may consider it more useful to permit linking proprietary applications with\n-the library. If this is what you want to do, use the GNU Lesser General\n-Public License instead of this License. But first, please read\n-<http://www.gnu.org/philosophy/why-not-lgpl.html>.\n' |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/__init__.py --- a/phe/__init__.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
@@ -1,4 +0,0 @@ -if __name__ == "phe": - # If this package is added as library, append extended path. - from pkgutil import extend_path - __path__ = extend_path(__path__, __name__) \ No newline at end of file |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/metadata/__init__.py --- a/phe/metadata/__init__.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
@@ -1,16 +0,0 @@ -"""Metadata related information.""" - -import abc - -class PHEMetaData(object): - """Abstract class to provide interface for meta-data creation.""" - - __metaclass__ = abc.ABCMeta - - def __init__(self): - pass - - @abc.abstractmethod - def get_meta(self): - """Get the metadata.""" - raise NotImplementedError("get meta has not been implemented yet.") |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/variant/GATKVariantCaller.py --- a/phe/variant/GATKVariantCaller.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
@@ -1,126 +0,0 @@ -''' -Created on 22 Sep 2015 - -@author: alex -''' -from collections import OrderedDict -import logging -import os -import subprocess - -from phe.variant import VariantCaller - - -class GATKVariantCaller(VariantCaller): - """Implemetation of the Broad institute's variant caller.""" - - name = "gatk" - """Plain text name of the variant caller.""" - - _default_options = "--sample_ploidy 2 --genotype_likelihoods_model BOTH -rf BadCigar -out_mode EMIT_ALL_SITES -nt 1" - """Default options for the variant caller.""" - - def __init__(self, cmd_options=None): - """Constructor""" - if cmd_options is None: - cmd_options = self._default_options - - super(GATKVariantCaller, self).__init__(cmd_options=cmd_options) - - self.last_command = None - - def get_info(self, plain=False): - d = {"name": "gatk", "version": self.get_version(), "command": self.last_command} - - if plain: - result = "GATK(%(version)s): %(command)s" % d - else: - result = OrderedDict(d) - - return result - - def get_version(self): - - p = subprocess.Popen(["java", "-jar", os.environ["GATK_JAR"], "-version"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) - (output, _) = p.communicate() - - # last character is EOL. - version = output.split("\n")[-2] - - return version - - def make_vcf(self, *args, **kwargs): - ref = kwargs.get("ref") - bam = kwargs.get("bam") - - if kwargs.get("vcf_file") is None: - kwargs["vcf_file"] = "variants.vcf" - - opts = {"ref": os.path.abspath(ref), - "bam": os.path.abspath(bam), - "gatk_jar": os.environ["GATK_JAR"], - "all_variants_file": os.path.abspath(kwargs.get("vcf_file")), - "extra_cmd_options": self.cmd_options} - -# if not self.create_aux_files(ref): -# logging.warn("Auxiliary files were not created.") -# return False - - # Call variants - # FIXME: Sample ploidy = 2? - os.environ["GATK_JAR"] - cmd = "java -XX:+UseSerialGC -jar %(gatk_jar)s -T UnifiedGenotyper -R %(ref)s -I %(bam)s -o %(all_variants_file)s %(extra_cmd_options)s" % opts - success = os.system(cmd) - - if success != 0: - logging.warn("Calling variants returned non-zero exit status.") - return False - - self.last_command = cmd - - return True - - def create_aux_files(self, ref): - """Create auxiliary files needed for this variant. - - Tools needed: samtools and picard tools. Picard tools is a Java - library that can be defined using environment variable: PICARD_JAR - specifying path to picard.jar or PICARD_TOOLS_PATH specifying path - to the directory where separate jars are (older version before jars - were merged into a single picard.jar). - Parameters: - ----------- - ref: str - Path to the reference file. - - Returns: - -------- - bool: - True if auxiliary files were created, False otherwise. - """ - - ref_name, _ = os.path.splitext(ref) - - success = os.system("samtools faidx %s" % ref) - - if success != 0: - logging.warn("Fasta index could not be created.") - return False - - d = {"ref": ref, "ref_name": ref_name} - - if os.environ.get("PICARD_TOOLS_PATH"): - d["picard_tools_path"] = os.path.join(os.environ["PICARD_TOOLS_PATH"], "CreateSequenceDictionary.jar") - elif os.environ.get("PICARD_JAR"): - # This is used in newer version of PICARD tool where multiple - # jars were merged into a single jar file. - d["picard_tools_path"] = "%s %s" % (os.environ["PICARD_JAR"], "CreateSequenceDictionary") - else: - logging.error("Picard tools are not present in the path.") - return False - - success = os.system("java -jar %(picard_tools_path)s R=%(ref)s O=%(ref_name)s.dict" % d) - - if success != 0: - logging.warn("Dictionary for the %s reference could not be created", ref) - return False |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/variant/MPileupVariantCaller.py --- a/phe/variant/MPileupVariantCaller.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
@@ -1,96 +0,0 @@ -''' -Created on 22 Sep 2015 - -@author: alex -''' -from collections import OrderedDict -import logging -import os -import subprocess -import tempfile - -from phe.variant import VariantCaller - - -class MPileupVariantCaller(VariantCaller): - """Implemetation of the Broad institute's variant caller.""" - - name = "mpileup" - """Plain text name of the variant caller.""" - - _default_options = "-m -f GQ" - """Default options for the variant caller.""" - - def __init__(self, cmd_options=None): - """Constructor""" - if cmd_options is None: - cmd_options = self._default_options - - super(MPileupVariantCaller, self).__init__(cmd_options=cmd_options) - - self.last_command = None - - def get_info(self, plain=False): - d = {"name": self.name, "version": self.get_version(), "command": self.last_command} - - if plain: - result = "mpileup(%(version)s): %(command)s" % d - else: - result = OrderedDict(d) - - return result - - def get_version(self): - - p = subprocess.Popen(["samtools", "--version"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) - (output, _) = p.communicate() - - # first line is the version of the samtools - version = output.split("\n")[0].split(" ")[1] - - return version - - def make_vcf(self, *args, **kwargs): - ref = kwargs.get("ref") - bam = kwargs.get("bam") - - if kwargs.get("vcf_file") is None: - kwargs["vcf_file"] = "variants.vcf" - - opts = {"ref": os.path.abspath(ref), - "bam": os.path.abspath(bam), - "all_variants_file": os.path.abspath(kwargs.get("vcf_file")), - "extra_cmd_options": self.cmd_options} - - with tempfile.NamedTemporaryFile(suffix=".pileup") as tmp: - opts["pileup_file"] = tmp.name - cmd = "samtools mpileup -t DP,DV,DP4,DPR,SP -Auf %(ref)s %(bam)s | bcftools call %(extra_cmd_options)s > %(all_variants_file)s" % opts - print cmd - self.last_command = cmd - if os.system(cmd) != 0: - logging.warn("Pileup creation was not successful.") - return False - - return True - - def create_aux_files(self, ref): - """Index reference with faidx from samtools. - - Parameters: - ----------- - ref: str - Path to the reference file. - - Returns: - -------- - bool: - True if auxiliary files were created, False otherwise. - """ - - success = os.system("samtools faidx %s" % ref) - - if success != 0: - logging.warn("Fasta index could not be created.") - return False - - return True |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/variant/__init__.py --- a/phe/variant/__init__.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
b'@@ -1,288 +0,0 @@\n-"""Classes and methods to work with variants and such."""\n-import abc\n-#ulf\n-# from collections import OrderedDict\n-try:\n- from collections import OrderedDict\n-except ImportError:\n- from ordereddict import OrderedDict\n- \n-import logging\n-import pickle\n-\n-from vcf import filters\n-import vcf\n-from vcf.parser import _Filter\n-\n-from phe.metadata import PHEMetaData\n-from phe.variant_filters import make_filters, PHEFilterBase, str_to_filters\n-\n-\n-class VCFTemplate(object):\n- """This is a small hack class for the Template used in generating\n- VCF file."""\n-\n- def __init__(self, vcf_reader):\n- self.infos = vcf_reader.infos\n- self.formats = vcf_reader.formats\n- self.filters = vcf_reader.filters\n- self.alts = vcf_reader.alts\n- self.contigs = vcf_reader.contigs\n- self.metadata = vcf_reader.metadata\n- self._column_headers = vcf_reader._column_headers\n- self.samples = vcf_reader.samples\n-\n-class VariantSet(object):\n- """A convenient representation of set of variants.\n- TODO: Implement iterator and generator for the variant set.\n- """\n-\n- _reader = None\n-\n- def __init__(self, vcf_in, filters=None):\n- """Constructor of variant set.\n-\n- Parameters:\n- -----------\n- vcf_in: str\n- Path to the VCF file for loading information.\n- filters: str or dict, optional\n- Dictionary or string of the filter:threshold key value pairs.\n- """\n- self.vcf_in = vcf_in\n- self._reader = vcf.Reader(filename=vcf_in)\n- self.out_template = VCFTemplate(self._reader)\n-\n- self.filters = []\n- if filters is not None:\n- if isinstance(filters, str):\n- self.filters = str_to_filters(filters)\n- elif isinstance(filters, dict):\n- self.filters = make_filters(config=filters)\n- elif isinstance(filters, list):\n- self.filters = filters\n- else:\n- logging.warn("Could not create filters from %s", filters)\n- else:\n- reader = vcf.Reader(filename=self.vcf_in)\n- filters = {}\n- for filter_id in reader.filters:\n- filters.update(PHEFilterBase.decode(filter_id))\n-\n- if filters:\n- self.filters = make_filters(config=filters)\n-\n- self.variants = []\n-\n- def filter_variants(self, keep_only_snps=True):\n- """Create a variant """\n-\n- if self._reader is None:\n- # Create a reader class from input VCF.\n- self._reader = vcf.Reader(filename=self.vcf_in)\n-\n- # get list of existing filters.\n- existing_filters = {}\n- removed_filters = []\n-\n- for filter_id in self._reader.filters:\n- conf = PHEFilterBase.decode(filter_id)\n- tuple(conf.keys())\n- existing_filters.update({tuple(conf.keys()):filter_id})\n-\n- # Add each filter we are going to use to the record.\n- # This is needed for writing out proper #FILTER header in VCF.\n- for record_filter in self.filters:\n- # We know that each filter has short description method.\n- short_doc = record_filter.short_desc()\n- short_doc = short_doc.split(\'\\n\')[0].lstrip()\n-\n- filter_name = PHEFilterBase.decode(record_filter.filter_name())\n-\n- # Check if the sample has been filtered for this type of filter\n- # in the past. If so remove is, because it is going to be refiltered.\n- if tuple(filter_name) in existing_filters:\n- logging.info("Removing existing filter: %s", existing_filters[tuple(filter_name)])\n- removed_filters.append(existing_filters[tuple(filter_name)])\n- del self._reader.filters[existing_filters[tuple(filter_name)]]\n-\n- self._reader.filters[record_filter.filter_name()] = _Filter(record_filter.filter_name(), short_doc)\n-\n- '..b' (default: False).\n-\n- Returns:\n- int:\n- Number of records written.\n- """\n- written_variants = 0\n- with open(vcf_out, "w") as out_vcf:\n- writer = vcf.Writer(out_vcf, self.out_template)\n- for record in self.variants:\n-\n- if only_snps and not record.is_snp:\n- continue\n-\n- if only_good and record.FILTER != "PASS" or record.FILTER is None:\n- continue\n-\n- writer.write_record(record)\n- written_variants += 1\n-\n- return written_variants\n-\n- def _write_bad_variants(self, vcf_out):\n- """**PRIVATE:** Write only those records that **haven\'t** passed."""\n- written_variants = 0\n- with open(vcf_out, "w") as out_vcf:\n- writer = vcf.Writer(out_vcf, self.out_template)\n- for record in self.variants:\n- if record.FILTER != "PASS" and record.FILTER is not None:\n- writer.write_record(record)\n- written_variants += 1\n- return written_variants\n-\n- def serialise(self, out_file):\n- """Save the data in this class to a file for future use/reload.\n-\n- Parameters:\n- -----------\n- out_file: str\n- path to file where the data should be written to.\n-\n- Returns:\n- --------\n- int:\n- Number of variants written.\n- """\n- written_variants = 0\n- with open(out_file, "w") as out_vcf:\n- writer = vcf.Writer(out_vcf, self.out_template)\n- for record in self.variants:\n- writer.write_record(record)\n- written_variants += 1\n-\n- return written_variants\n-\n- def update_filters(self, new_filters):\n- """Update internal filters in the output template."""\n- for new_filter, filter_data in new_filters.items():\n- self.out_template.filters[new_filter] = filter_data\n-\n-\n-class VariantCaller(PHEMetaData):\n- """Abstract class used for access to the implemented variant callers."""\n-\n- __metaclass__ = abc.ABCMeta\n-\n- def __init__(self, cmd_options=None):\n- """Constructor for variant caller.\n-\n- Parameters:\n- -----------\n- cmd_options: str, optional\n- Command options to pass to the variant command.\n- """\n- self.cmd_options = cmd_options\n-\n- super(VariantCaller, self).__init__()\n-\n- @abc.abstractmethod\n- def make_vcf(self, *args, **kwargs):\n- """Create a VCF from **BAM** file.\n-\n- Parameters:\n- -----------\n- ref: str\n- Path to the reference file.\n- bam: str\n- Path to the indexed **BAM** file for calling variants.\n- vcf_file: str\n- path to the VCF file where data will be written to.\n-\n- Returns:\n- --------\n- bool:\n- True if variant calling was successful, False otherwise.\n- """\n- raise NotImplementedError("make_vcf is not implemented yet.")\n-\n- @abc.abstractmethod\n- def create_aux_files(self, ref):\n- """Create needed (if any) auxiliary files.\n- These files are required for proper functioning of the variant caller.\n- """\n- raise NotImplementedError("create_aux_files is not implemeted.")\n-\n- @abc.abstractmethod\n- def get_info(self, plain=False):\n- """Get information about this variant caller."""\n- raise NotImplementedError("Get info has not been implemented yet."\n- )\n- def get_meta(self):\n- """Get the metadata about this variant caller."""\n- od = self.get_info()\n- od["ID"] = "VariantCaller"\n- return OrderedDict({"PHEVariantMetaData": [od]})\n-\n- @abc.abstractmethod\n- def get_version(self):\n- """Get the version of the underlying command used."""\n- raise NotImplementedError("Get version has not been implemented yet.")\n' |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/variant/variant_factory.py --- a/phe/variant/variant_factory.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
@@ -1,92 +0,0 @@ -'''Classes and functions for working with variant callers. - -Created on 22 Sep 2015 - -@author: alex -''' -import glob -import inspect -import logging -import os -import sys - -from phe.variant import VariantCaller - -def dynamic_caller_loader(): - """Fancy way of dynamically importing existing variant callers. - - Returns - ------- - dict: - Available variant callers dictionary. Keys are parameters that - can be used to call variants. - """ - - # We assume the caller are in the same directory as THIS file. - variants_dir = os.path.dirname(__file__) - variants_dir = os.path.abspath(variants_dir) - - # This is populated when the module is first imported. - avail_callers = {} - - # Add this directory to the syspath. - sys.path.insert(0, variants_dir) - - # Find all "py" files. - for caller_mod in glob.glob(os.path.join(variants_dir, "*.py")): - - # Derive name of the module where caller is. - caller_mod_file = os.path.basename(caller_mod) - - # Ignore __init__ file, only base class is there. - if caller_mod_file.startswith("__init__"): - continue - - # Import the module with a caller. - mod = __import__(caller_mod_file.replace(".pyc", "").replace(".py", "")) - - # Find all the classes contained in this module. - classes = inspect.getmembers(mod, inspect.isclass) - for cls_name, cls in classes: - # For each class, if it is a sublass of VariantCaller, add it. - if cls_name != "VariantCaller" and issubclass(cls, VariantCaller): - # The name is inherited and defined within each caller. - avail_callers[cls.name] = cls - - sys.path.remove(variants_dir) - - return avail_callers - -_avail_variant_callers = dynamic_caller_loader() - -def available_callers(): - """Return list of available variant callers.""" - return _avail_variant_callers.keys() - -def factory(variant=None, custom_options=None): - """Make an instance of a variant class. - - Parameters: - ----------- - variant: str, optional - Name of the variant class to instantiate. - custom_options: str, optional - Custom options to be passed directly to the implementing class. - - Returns: - -------- - :py:class:`phe.variant.VariantCaller`: - Instance of the :py:class:`phe.variant.VariantCaller` for given - variant name, or None if one couldn't be found. - """ - if variant is not None and isinstance(variant, str): - - variant = variant.lower() - if variant in _avail_variant_callers: - return _avail_variant_callers[variant](cmd_options=custom_options) - else: - logging.error("No implementation for %s mapper.") - return None - - logging.warn("Unknown parameters. Mapper could not be initialised.") - return None |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 phe/variant_filters/__init__.py --- a/phe/variant_filters/__init__.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
@@ -1,209 +0,0 @@ -"""Classes and functions for working with variant filters.""" - -from __builtin__ import __import__ -from abc import abstractproperty -import abc -import argparse -import glob -import inspect -import logging -import os -import re -import sys - -import vcf -import vcf.filters -from vcf.parser import _Filter - -IUPAC_CODES = {frozenset(["A", "G"]): "R", - frozenset(["C", "T"]): "Y", - frozenset(["G", "C"]): "S", - frozenset(["A", "T"]): "W", - frozenset(["G", "T"]): "K", - frozenset(["A", "C"]): "M", - frozenset(["C", "G", "T"]): "B", - frozenset(["A", "G", "T"]): "D", - frozenset(["A", "C", "T"]): "H", - frozenset(["A", "C", "G"]): "V" - } - -class PHEFilterBase(vcf.filters.Base): - """Base class for VCF filters.""" - __meta__ = abc.ABCMeta - - magic_sep = ":" - decoder_pattern = re.compile(magic_sep) - - @abc.abstractproperty - def parameter(self): - """Short name of parameter being filtered.""" - return self.parameter - - @abc.abstractproperty - def _default_threshold(self): - """Default threshold for filtering.""" - return self._default_threshold - - def __init__(self, args): - super(PHEFilterBase, self).__init__(args) - - # Change the threshold to custom gq value. - self.threshold = self._default_threshold - - if isinstance(args, dict): - self.threshold = args.get(self.parameter) - - def __str__(self): - return self.filter_name() - - @abc.abstractmethod - def short_desc(self): - """Short description of the filter (included in VCF).""" - raise NotImplementedError("Get short description is not implemented.") - - def get_config(self): - """This is used for reconstructing filter.""" - return {self.parameter: self.threshold} - - def filter_name(self): - """Create filter names by their parameter separated by magic. - E.g. if filter parameter is ad_ratio and threshold is 0.9 then - ad_ratio:0.9 if the filter name. - """ - return "%s%s%s" % (self.parameter, self.magic_sep, self.threshold) - - @staticmethod - def decode(filter_id): - """Decode name of filter.""" - conf = {} - - if PHEFilterBase.magic_sep in filter_id: - info = PHEFilterBase.decoder_pattern.split(filter_id) - assert len(info) == 2 - conf[info[0]] = info[1] - return conf - - def is_gap(self): - return False - - def is_n(self): - return True - - @staticmethod - def call_concensus(record): - extended_code = "N" - try: - sample_ad = set([str(c) for c in record.ALT] + [record.REF]) - - - for code, cov in IUPAC_CODES.items(): - if sample_ad == cov: - extended_code = code - break - except AttributeError: - extended_code = "N" - - return extended_code - -def dynamic_filter_loader(): - """Fancy way of dynamically importing existing filters. - - Returns - ------- - dict: - Available filters dictionary. Keys are parameters that - can be supplied to the filters. - """ - - # We assume the filters are in the same directory as THIS file. - filter_dir = os.path.dirname(__file__) - filter_dir = os.path.abspath(filter_dir) - - # This is populated when the module is first imported. - avail_filters = {} - - # Add this directory to the syspath. - sys.path.insert(0, filter_dir) - - # Find all "py" files. - for filter_mod in glob.glob(os.path.join(filter_dir, "*.py")): - - # Derive name of the module where filter is. - filter_mod_file = os.path.basename(filter_mod) - - # Ignore this file, obviously. - if filter_mod_file.startswith("__init__"): - continue - - # Import the module with a filter. - mod = __import__(filter_mod_file.replace(".pyc", "").replace(".py", "")) - - # Find all the classes contained in this module. - classes = inspect.getmembers(mod, inspect.isclass) - for cls_name, cls in classes: - # For each class, if it is a sublass of PHEFilterBase, add it. - if cls_name != "PHEFilterBase" and issubclass(cls, PHEFilterBase): - # The parameters are inherited and defined within each filter. - avail_filters[cls.parameter] = cls - - sys.path.remove(filter_dir) - - return avail_filters - -_avail_filters = dynamic_filter_loader() - -def available_filters(): - """Return list of available filters.""" - return _avail_filters.keys() - -def str_to_filters(filters): - """Convert from filter string to array of filters. - E.g. ad_ration:0.9,min_depth:5 - - Parameters: - ----------- - filters: str - String version of filters, separated by comma. - - Returns: - -------- - list: - List of :py:class:`phe.variant_filters.PHEFilterBase` instances. - """ - - config = {} - for kv_pair in filters.split(","): - pair = kv_pair.split(":") - assert len(pair) == 2, "Filters should be separated by ':' %s" % kv_pair - - # We don't care about casting them to correct type because Filters - # will do it for us. - config[pair[0]] = pair[1] - - return make_filters(config) - -def make_filters(config): - """Create a list of filters from *config*. - - Parameters: - ----------- - config: dict, optional - Dictionary with parameter: value pairs. For each parameter, an - appropriate Filter will be found and instanciated. - - Returns: - -------- - list: - List of :py:class:`PHEFilterBase` filters. - """ - filters = [] - - if config: - for custom_filter in config: - if custom_filter in _avail_filters: - filters.append(_avail_filters[custom_filter](config)) - else: - logging.warn("Could not find appropriate filter for %s", - custom_filter) - - return filters |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 test-data/1_short.vcf --- a/test-data/1_short.vcf Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
b'@@ -1,50000 +0,0 @@\n-##fileformat=VCFv4.1\n-##FILTER=<ID=LowQual,Description="Low quality">\n-##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">\n-##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">\n-##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">\n-##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n-##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">\n-##GATKCommandLine=<ID=SelectVariants,Version=2.6-5-gba531bd,Date="Thu May 22 09:24:36 BST 2014",Epoch=1400747076888,CommandLineOptions="analysis_type=SelectVariants input_file=[] read_buffer_size=null phone_home=AWS gatk_key=null tag=NA read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/phengs/galaxy/database/tmp/tmp-gatk-lMh4vi/gatk_input.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 baq=OFF baqGapOpenPenalty=40.0 fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOriginalQualities=false defaultBaseQualities=-1 performanceLog=null BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 globalQScorePrior=-1.0 allow_bqsr_on_reduced_bams_despite_repeated_warnings=false validation_strictness=SILENT remove_program_records=false keep_program_records=false unsafe=null disable_auto_index_creation_and_locking_when_reading_rods=false num_threads=2 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false logging_level=INFO log_to_file=null help=false version=false variant=(RodBinding name=variant source=/phengs/galaxy/database/tmp/tmp-gatk-lMh4vi/input_variant.vcf) discordance=(RodBinding name= source=UNBOUND) concordance=(RodBinding name= source=UNBOUND) out=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub no_cmdline_in_header=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub bcf=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sample_name=[] sample_expressions=null sample_file=null exclude_sample_name=[] exclude_sample_file=[] select_expressions=[] excludeNonVariants=false excludeFiltered=false restrictAllelesTo=ALL keepOriginalAC=false mendelianViolation=false mendelianViolationQualThreshold=0.0 select_random_fraction=0.0 remove_fraction_genotypes=0.0 selectTypeToInclude=[SNP] keepIDs=null fullyDecode=false forceGenotypesDecode=false justRead=false maxIndelSize=2147483647 ALLOW_NONOVERLAPPING_COMMAND_LINE_SAMPLES=false filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false">\n-##GATKCommandLine=<ID=UnifiedGenotyper,Version=2.6-5-gba531bd,Date="Thu May 22 09:10:24 BST 2014",Epoch=1400746224902,CommandLineOptions="analysis_type=UnifiedGenotyper input_file=[/phengs/galaxy/database/tmp/tmp-gatk-x7S0pF/gatk_input_0.bam] read_buffer_size=null phone_home=AWS gatk_key=null tag=NA read_filter=[BadCigar] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/phengs/galaxy/database/tmp/tmp-gatk-x7S0pF/gatk_input.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=NONE downsample_to_fraction=null downsample_to_coverage=null baq=OFF baqGapOpenPenalty=40.0 fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOrigina'..b'.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49922\t.\tT\t.\t434.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49923\t.\tC\t.\t434.23\t.\tAN=2;DP=143;MQ=60.00;MQ0=0\tGT:DP\t0/0:143\n-gi|15829254|ref|NC_002695.1|\t49924\t.\tA\t.\t428.23\t.\tAN=2;DP=141;MQ=60.00;MQ0=0\tGT:DP\t0/0:141\n-gi|15829254|ref|NC_002695.1|\t49925\t.\tA\t.\t440.23\t.\tAN=2;DP=149;MQ=60.00;MQ0=0\tGT:DP\t0/0:149\n-gi|15829254|ref|NC_002695.1|\t49926\t.\tT\t.\t440.23\t.\tAN=2;DP=152;MQ=60.00;MQ0=0\tGT:DP\t0/0:152\n-gi|15829254|ref|NC_002695.1|\t49927\t.\tG\t.\t446.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49928\t.\tT\t.\t452.23\t.\tAN=2;DP=157;MQ=60.00;MQ0=0\tGT:DP\t0/0:157\n-gi|15829254|ref|NC_002695.1|\t49929\t.\tC\t.\t443.23\t.\tAN=2;DP=153;MQ=60.00;MQ0=0\tGT:DP\t0/0:153\n-gi|15829254|ref|NC_002695.1|\t49930\t.\tG\t.\t449.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49931\t.\tA\t.\t449.23\t.\tAN=2;DP=156;MQ=60.00;MQ0=0\tGT:DP\t0/0:156\n-gi|15829254|ref|NC_002695.1|\t49932\t.\tT\t.\t461.23\t.\tAN=2;DP=159;MQ=60.00;MQ0=0\tGT:DP\t0/0:159\n-gi|15829254|ref|NC_002695.1|\t49933\t.\tG\t.\t449.23\t.\tAN=2;DP=156;MQ=60.00;MQ0=0\tGT:DP\t0/0:156\n-gi|15829254|ref|NC_002695.1|\t49934\t.\tA\t.\t455.23\t.\tAN=2;DP=156;MQ=60.00;MQ0=0\tGT:DP\t0/0:156\n-gi|15829254|ref|NC_002695.1|\t49935\t.\tA\t.\t452.23\t.\tAN=2;DP=156;MQ=60.00;MQ0=0\tGT:DP\t0/0:156\n-gi|15829254|ref|NC_002695.1|\t49936\t.\tG\t.\t455.23\t.\tAN=2;DP=158;MQ=60.00;MQ0=0\tGT:DP\t0/0:157\n-gi|15829254|ref|NC_002695.1|\t49937\t.\tA\t.\t440.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49938\t.\tG\t.\t449.23\t.\tAN=2;DP=157;MQ=60.00;MQ0=0\tGT:DP\t0/0:157\n-gi|15829254|ref|NC_002695.1|\t49939\t.\tC\t.\t443.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49940\t.\tA\t.\t440.23\t.\tAN=2;DP=154;MQ=60.00;MQ0=0\tGT:DP\t0/0:154\n-gi|15829254|ref|NC_002695.1|\t49941\t.\tT\t.\t449.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49942\t.\tC\t.\t446.23\t.\tAN=2;DP=155;MQ=60.00;MQ0=0\tGT:DP\t0/0:155\n-gi|15829254|ref|NC_002695.1|\t49943\t.\tC\t.\t428.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49944\t.\tG\t.\t428.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49945\t.\tC\t.\t431.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49946\t.\tA\t.\t425.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49947\t.\tC\t.\t431.23\t.\tAN=2;DP=145;MQ=60.00;MQ0=0\tGT:DP\t0/0:145\n-gi|15829254|ref|NC_002695.1|\t49948\t.\tA\t.\t404.23\t.\tAN=2;DP=137;MQ=60.00;MQ0=0\tGT:DP\t0/0:137\n-gi|15829254|ref|NC_002695.1|\t49949\t.\tT\t.\t425.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:143\n-gi|15829254|ref|NC_002695.1|\t49950\t.\tT\t.\t419.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49951\t.\tG\t.\t431.23\t.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49952\t.\tT\t.\t437.23\t.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49953\t.\tT\t.\t428.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49954\t.\tG\t.\t434.23\t.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49955\t.\tT\t.\t428.23\t.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49956\t.\tG\t.\t434.23\t.\tAN=2;DP=146;MQ=60.00;MQ0=0\tGT:DP\t0/0:146\n-gi|15829254|ref|NC_002695.1|\t49957\t.\tA\t.\t428.23\t.\tAN=2;DP=144;MQ=60.00;MQ0=0\tGT:DP\t0/0:144\n-gi|15829254|ref|NC_002695.1|\t49958\t.\tA\t.\t416.23\t.\tAN=2;DP=140;MQ=60.00;MQ0=0\tGT:DP\t0/0:140\n-gi|15829254|ref|NC_002695.1|\t49959\t.\tA\t.\t407.23\t.\tAN=2;DP=138;MQ=60.00;MQ0=0\tGT:DP\t0/0:138\n-gi|15829254|ref|NC_002695.1|\t49960\t.\tG\t.\t395.23\t.\tAN=2;DP=136;MQ=60.00;MQ0=0\tGT:DP\t0/0:136\n-gi|15829254|ref|NC_002695.1|\t49961\t.\tC\t.\t404.23\t.\tAN=2;DP=136;MQ=60.00;MQ0=0\tGT:DP\t0/0:136\n-gi|15829254|ref|NC_002695.1|\t49962\t.\tC\t.\t392.23\t.\tAN=2;DP=134;MQ=60.00;MQ0=0\tGT:DP\t0/0:134\n-gi|15829254|ref|NC_002695.1|\t49963\t.\tG\t.\t398.23\t.\tAN=2;DP=134;MQ=60.00;MQ0=0\tGT:DP\t0/0:134\n-gi|15829254|ref|NC_002695.1|\t49964\t.\tA\t.\t398.23\t.\tAN=2;DP=136;MQ=60.00;MQ0=0\tGT:DP\t0/0:136\n' |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 test-data/2_short.vcf --- a/test-data/2_short.vcf Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
b'@@ -1,50000 +0,0 @@\n-##fileformat=VCFv4.1\n-##FILTER=<ID=LowQual,Description="Low quality">\n-##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">\n-##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">\n-##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">\n-##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n-##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">\n-##GATKCommandLine=<ID=SelectVariants,Version=2.6-5-gba531bd,Date="Thu May 22 09:15:54 BST 2014",Epoch=1400746554393,CommandLineOptions="analysis_type=SelectVariants input_file=[] read_buffer_size=null phone_home=AWS gatk_key=null tag=NA read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/phengs/galaxy/database/tmp/tmp-gatk-4mpB9s/gatk_input.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 baq=OFF baqGapOpenPenalty=40.0 fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOriginalQualities=false defaultBaseQualities=-1 performanceLog=null BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 globalQScorePrior=-1.0 allow_bqsr_on_reduced_bams_despite_repeated_warnings=false validation_strictness=SILENT remove_program_records=false keep_program_records=false unsafe=null disable_auto_index_creation_and_locking_when_reading_rods=false num_threads=2 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false logging_level=INFO log_to_file=null help=false version=false variant=(RodBinding name=variant source=/phengs/galaxy/database/tmp/tmp-gatk-4mpB9s/input_variant.vcf) discordance=(RodBinding name= source=UNBOUND) concordance=(RodBinding name= source=UNBOUND) out=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub no_cmdline_in_header=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub bcf=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sample_name=[] sample_expressions=null sample_file=null exclude_sample_name=[] exclude_sample_file=[] select_expressions=[] excludeNonVariants=false excludeFiltered=false restrictAllelesTo=ALL keepOriginalAC=false mendelianViolation=false mendelianViolationQualThreshold=0.0 select_random_fraction=0.0 remove_fraction_genotypes=0.0 selectTypeToInclude=[SNP] keepIDs=null fullyDecode=false forceGenotypesDecode=false justRead=false maxIndelSize=2147483647 ALLOW_NONOVERLAPPING_COMMAND_LINE_SAMPLES=false filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false">\n-##GATKCommandLine=<ID=UnifiedGenotyper,Version=2.6-5-gba531bd,Date="Thu May 22 09:06:54 BST 2014",Epoch=1400746014908,CommandLineOptions="analysis_type=UnifiedGenotyper input_file=[/phengs/galaxy/database/tmp/tmp-gatk-Gz3f1A/gatk_input_0.bam] read_buffer_size=null phone_home=AWS gatk_key=null tag=NA read_filter=[BadCigar] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/phengs/galaxy/database/tmp/tmp-gatk-Gz3f1A/gatk_input.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=NONE downsample_to_fraction=null downsample_to_coverage=null baq=OFF baqGapOpenPenalty=40.0 fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOrigina'..b'AN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49921\t.\tT\t.\t157.23\t.\tAN=2;DP=49;MQ=60.00;MQ0=0\tGT:DP\t0/0:49\n-gi|15829254|ref|NC_002695.1|\t49922\t.\tT\t.\t157.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49923\t.\tC\t.\t154.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49924\t.\tA\t.\t151.23\t.\tAN=2;DP=45;MQ=60.00;MQ0=0\tGT:DP\t0/0:45\n-gi|15829254|ref|NC_002695.1|\t49925\t.\tA\t.\t163.23\t.\tAN=2;DP=50;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49926\t.\tT\t.\t160.23\t.\tAN=2;DP=50;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49927\t.\tG\t.\t160.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49928\t.\tT\t.\t160.23\t.\tAN=2;DP=52;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49929\t.\tC\t.\t160.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49930\t.\tG\t.\t163.23\t.\tAN=2;DP=53;MQ=60.00;MQ0=0\tGT:DP\t0/0:53\n-gi|15829254|ref|NC_002695.1|\t49931\t.\tA\t.\t166.23\t.\tAN=2;DP=54;MQ=60.00;MQ0=0\tGT:DP\t0/0:54\n-gi|15829254|ref|NC_002695.1|\t49932\t.\tT\t.\t169.23\t.\tAN=2;DP=55;MQ=60.00;MQ0=0\tGT:DP\t0/0:55\n-gi|15829254|ref|NC_002695.1|\t49933\t.\tG\t.\t163.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49934\t.\tA\t.\t166.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49935\t.\tA\t.\t169.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49936\t.\tG\t.\t169.23\t.\tAN=2;DP=51;MQ=60.00;MQ0=0\tGT:DP\t0/0:51\n-gi|15829254|ref|NC_002695.1|\t49937\t.\tA\t.\t163.23\t.\tAN=2;DP=50;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49938\t.\tG\t.\t160.23\t.\tAN=2;DP=50;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49939\t.\tC\t.\t154.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49940\t.\tA\t.\t151.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49941\t.\tT\t.\t157.23\t.\tAN=2;DP=49;MQ=60.00;MQ0=0\tGT:DP\t0/0:49\n-gi|15829254|ref|NC_002695.1|\t49942\t.\tC\t.\t160.23\t.\tAN=2;DP=50;MQ=60.00;MQ0=0\tGT:DP\t0/0:50\n-gi|15829254|ref|NC_002695.1|\t49943\t.\tC\t.\t157.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49944\t.\tG\t.\t157.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49945\t.\tC\t.\t154.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49946\t.\tA\t.\t154.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49947\t.\tC\t.\t154.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49948\t.\tA\t.\t151.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49949\t.\tT\t.\t151.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49950\t.\tT\t.\t148.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49951\t.\tG\t.\t154.23\t.\tAN=2;DP=49;MQ=60.00;MQ0=0\tGT:DP\t0/0:49\n-gi|15829254|ref|NC_002695.1|\t49952\t.\tT\t.\t157.23\t.\tAN=2;DP=49;MQ=60.00;MQ0=0\tGT:DP\t0/0:49\n-gi|15829254|ref|NC_002695.1|\t49953\t.\tT\t.\t154.23\t.\tAN=2;DP=49;MQ=60.00;MQ0=0\tGT:DP\t0/0:49\n-gi|15829254|ref|NC_002695.1|\t49954\t.\tG\t.\t157.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49955\t.\tT\t.\t151.23\t.\tAN=2;DP=48;MQ=60.00;MQ0=0\tGT:DP\t0/0:48\n-gi|15829254|ref|NC_002695.1|\t49956\t.\tG\t.\t154.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49957\t.\tA\t.\t151.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49958\t.\tA\t.\t148.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49959\t.\tA\t.\t148.23\t.\tAN=2;DP=47;MQ=60.00;MQ0=0\tGT:DP\t0/0:47\n-gi|15829254|ref|NC_002695.1|\t49960\t.\tG\t.\t145.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49961\t.\tC\t.\t145.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49962\t.\tC\t.\t148.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49963\t.\tG\t.\t145.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n-gi|15829254|ref|NC_002695.1|\t49964\t.\tA\t.\t145.23\t.\tAN=2;DP=46;MQ=60.00;MQ0=0\tGT:DP\t0/0:46\n' |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 test-data/testresult.fa --- a/test-data/testresult.fa Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
@@ -1,6 +0,0 @@ ->1_short -TTCTTCATGA ->2_short -TTTTACACGA ->reference -CCCATAGCAG |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 tool_dependencies.xml --- a/tool_dependencies.xml Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
@@ -1,21 +0,0 @@ -<?xml version="1.0"?> -<tool_dependency> - <package name="python" version="2.7.10"> - <repository changeset_revision="0339c4a9b87b" name="package_python_2_7_10" owner="iuc" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu" /> - </package> - <package name="pyvcf" version="0.6.8dev"> - <repository changeset_revision="7ef691e979b5" name="package_python_2_7_pyvcf_0_6_8dev" owner="ulfschaefer" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu"/> - </package> - <package name="pyyaml" version="3.11"> - <repository changeset_revision="99267d131c05" name="package_python_2_7_pyyaml_3_11" owner="iuc" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu"/> - </package> - <package name="bintrees" version="2.0.2"> - <repository changeset_revision="1d94386c45bc" name="package_python_2_7_bintrees_2_0_2" owner="ulfschaefer" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu"/> - </package> - <package name="biopython" version="1.66"> - <repository changeset_revision="5d5355863287" name="package_python_2_7_biopython_1_66" owner="ulfschaefer" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu"/> - </package> - <package name="matplotlib" version="1.4"> - <repository changeset_revision="f7424e1cf115" name="package_python_2_7_matplotlib_1_4" owner="iuc" prior_installation_required="True" toolshed="https://toolshed.g2.bx.psu.edu"/> - </package> -</tool_dependency> |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 vcfs2fasta.py --- a/vcfs2fasta.py Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
b'@@ -1,415 +0,0 @@\n-#!/usr/bin/env python\n-\'\'\'\n-Merge SNP data from multiple VCF files into a single fasta file.\n-\n-Created on 5 Oct 2015\n-\n-@author: alex\n-\'\'\'\n-import argparse\n-from collections import OrderedDict\n-import glob\n-import itertools\n-import logging\n-import os\n-\n-from Bio import SeqIO\n-from bintrees import FastRBTree\n-\n-# Try importing the matplotlib and numpy for stats.\n-try:\n- from matplotlib import pyplot as plt\n- import numpy\n- can_stats = True\n-except ImportError:\n- can_stats = False\n-\n-import vcf\n-\n-from phe.variant_filters import IUPAC_CODES\n-\n-\n-def plot_stats(pos_stats, total_samples, plots_dir="plots", discarded={}):\n- if not os.path.exists(plots_dir):\n- os.makedirs(plots_dir)\n-\n- for contig in pos_stats:\n-\n- plt.style.use(\'ggplot\')\n-\n- x = numpy.array([pos for pos in pos_stats[contig] if pos not in discarded.get(contig, [])])\n- y = numpy.array([ float(pos_stats[contig][pos]["mut"]) / total_samples for pos in pos_stats[contig] if pos not in discarded.get(contig, []) ])\n-\n- f, (ax1, ax2, ax3, ax4) = plt.subplots(4, sharex=True, sharey=True)\n- f.set_size_inches(12, 15)\n- ax1.plot(x, y, \'ro\')\n- ax1.set_title("Fraction of samples with SNPs")\n- plt.ylim(0, 1.1)\n-\n- y = numpy.array([ float(pos_stats[contig][pos]["N"]) / total_samples for pos in pos_stats[contig] if pos not in discarded.get(contig, [])])\n- ax2.plot(x, y, \'bo\')\n- ax2.set_title("Fraction of samples with Ns")\n-\n- y = numpy.array([ float(pos_stats[contig][pos]["mix"]) / total_samples for pos in pos_stats[contig] if pos not in discarded.get(contig, [])])\n- ax3.plot(x, y, \'go\')\n- ax3.set_title("Fraction of samples with mixed bases")\n-\n- y = numpy.array([ float(pos_stats[contig][pos]["gap"]) / total_samples for pos in pos_stats[contig] if pos not in discarded.get(contig, [])])\n- ax4.plot(x, y, \'yo\')\n- ax4.set_title("Fraction of samples with uncallable genotype (gap)")\n-\n- plt.savefig(os.path.join(plots_dir, "%s.png" % contig), dpi=100)\n-\n-def get_mixture(record, threshold):\n- mixtures = {}\n- try:\n- if len(record.samples[0].data.AD) > 1:\n-\n- total_depth = sum(record.samples[0].data.AD)\n- # Go over all combinations of touples.\n- for comb in itertools.combinations(range(0, len(record.samples[0].data.AD)), 2):\n- i = comb[0]\n- j = comb[1]\n-\n- alleles = list()\n-\n- if 0 in comb:\n- alleles.append(str(record.REF))\n-\n- if i != 0:\n- alleles.append(str(record.ALT[i - 1]))\n- mixture = record.samples[0].data.AD[i]\n- if j != 0:\n- alleles.append(str(record.ALT[j - 1]))\n- mixture = record.samples[0].data.AD[j]\n-\n- ratio = float(mixture) / total_depth\n- if ratio == 1.0:\n- logging.debug("This is only designed for mixtures! %s %s %s %s", record, ratio, record.samples[0].data.AD, record.FILTER)\n-\n- if ratio not in mixtures:\n- mixtures[ratio] = []\n- mixtures[ratio].append(alleles.pop())\n-\n- elif ratio >= threshold:\n- try:\n- code = IUPAC_CODES[frozenset(alleles)]\n- if ratio not in mixtures:\n- mixtures[ratio] = []\n- mixtures[ratio].append(code)\n- except KeyError:\n- logging.warn("Could not retrieve IUPAC code for %s from %s", alleles, record)\n- except AttributeError:\n- mixtures = {}\n-\n- return mixtures\n-\n-def print_stats(stats, pos_stats, total_vars):\n- for contig in stats:\n- for sample, info in stats[contig].items():\n- print "%s,%i,%i" % (sample, len(info.get("n_pos", [])), total_vars)\n-\n'..b'n_ratio = float(len(sample_stats[contig][sample]["n_pos"])) / len(avail_pos[contig])\n- if sample_n_ratio > args.sample_Ns:\n- for pos in sample_stats[contig][sample]["n_pos"]:\n- pos_stats[contig][pos]["N"] -= 1\n-\n- logging.info("Removing %s due to high Ns in sample: %s", sample , sample_n_ratio)\n-\n- delete_samples.append(sample)\n-\n- samples = [sample for sample in samples if sample not in delete_samples]\n- snp_positions = []\n- with open(args.out, "w") as fp:\n-\n- for sample in samples:\n- sample_seq = ""\n- for contig in contigs:\n- if contig in avail_pos:\n- if args.reference:\n- positions = xrange(1, len(args.reference[contig]) + 1)\n- else:\n- positions = avail_pos[contig].keys()\n- for pos in positions:\n- if pos in avail_pos[contig]:\n- if not args.column_Ns or float(pos_stats[contig][pos]["N"]) / len(samples) < args.column_Ns and \\\n- float(pos_stats[contig][pos]["-"]) / len(samples) < args.column_Ns:\n- sample_seq += all_data[contig][sample][pos]\n- else:\n- if contig not in discarded:\n- discarded[contig] = []\n- discarded[contig].append(pos)\n- elif args.reference:\n- sample_seq += args.reference[contig][pos - 1]\n- elif args.reference:\n- sample_seq += args.reference[contig]\n-\n- fp.write(">%s\\n%s\\n" % (sample, sample_seq))\n- # Do the same for reference data.\n- ref_snps = ""\n-\n- for contig in contigs:\n- if contig in avail_pos:\n- if args.reference:\n- positions = xrange(1, len(args.reference[contig]) + 1)\n- else:\n- positions = avail_pos[contig].keys()\n- for pos in positions:\n- if pos in avail_pos[contig]:\n- if not args.column_Ns or float(pos_stats[contig][pos]["N"]) / len(samples) < args.column_Ns and \\\n- float(pos_stats[contig][pos]["-"]) / len(samples) < args.column_Ns:\n-\n- ref_snps += str(avail_pos[contig][pos])\n- snp_positions.append((contig, pos,))\n- elif args.reference:\n- ref_snps += args.reference[contig][pos - 1]\n- elif args.reference:\n- ref_snps += args.reference[contig]\n-\n- fp.write(">reference\\n%s\\n" % ref_snps)\n-\n- if can_stats and args.with_stats:\n- with open(args.with_stats, "wb") as fp:\n- fp.write("contig\\tposition\\tmutations\\tn_frac\\n")\n- for values in snp_positions:\n- fp.write("%s\\t%s\\t%s\\t%s\\n" % (values[0],\n- values[1],\n- float(pos_stats[values[0]][values[1]]["mut"]) / len(args.input),\n- float(pos_stats[values[0]][values[1]]["N"]) / len(args.input)))\n- plot_stats(pos_stats, len(samples), discarded=discarded, plots_dir=os.path.abspath(args.plots_dir))\n- # print_stats(sample_stats, pos_stats, total_vars=len(avail_pos[contig]))\n-\n- total_discarded = 0\n- for _, i in discarded.items():\n- total_discarded += len(i)\n- logging.info("Discarded total of %i poor quality columns", float(total_discarded) / len(args.input))\n- return 0\n-\n-if __name__ == \'__main__\':\n- import time\n-\n-# with PyCallGraph(output=graphviz):\n-# T0 = time.time()\n- r = main()\n-# T1 = time.time()\n-\n-# print "Time taken: %i" % (T1 - T0)\n- exit(r)\n' |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 vcfs2fasta.sh --- a/vcfs2fasta.sh Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
[ |
@@ -1,51 +0,0 @@ -#!/bin/bash - -echo $@ - -OUTPUT=$1 -shift -WITHMIXTURES=$1 -shift -COLUMNNS=$1 -shift -SAMPLENS=$1 -shift -REFERENCE=$1 -shift -INCLUDE=$1 -shift -EXCLUDE=$1 -shift -INPUT=$@ - -DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" -export PATH=$PATH:$DIR - -CMD="vcfs2fasta.py --out $OUTPUT --input $INPUT" - -if [ $WITHMIXTURES != "NOTTHERE" ]; then - CMD="$CMD --with-mixtures $WITHMIXTURES" -fi - -if [ $COLUMNNS != "NOTTHERE" ]; then - CMD="$CMD --column-Ns $COLUMNNS" -fi - -if [ $SAMPLENS != "NOTTHERE" ]; then - CMD="$CMD --sample-Ns $SAMPLENS" -fi - -if [ $REFERENCE != "NOTTHERE" ]; then - CMD="$CMD --reference $REFERENCE" -fi - -if [ $INCLUDE != "NOTTHERE" ]; then - CMD="$CMD --include INCLUDE" -fi - -if [ $EXCLUDE != "NOTTHERE" ]; then - CMD="$CMD --exclude EXCLUDE" -fi - -echo $CMD -eval $CMD |
b |
diff -r 7ac17b6d031e -r b09ffe50c378 vcfs2fasta.xml --- a/vcfs2fasta.xml Fri Dec 18 07:31:09 2015 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 |
b |
@@ -1,123 +0,0 @@ -<tool id="vcfs2fasta" name="VCFs to fasta" version="1.0"> - <description>Takes a set of VCF files and outputs a multi fasta file with only the variant positions.</description> - <requirements> - <requirement type="package" version="2.7.10">python</requirement> - <requirement type="package" version="0.6.8dev">pyvcf</requirement> - <requirement type="package" version="3.11">pyyaml</requirement> - <requirement type="package" version="2.0.2">bintrees</requirement> - <requirement type="package" version="1.66">biopython</requirement> - <requirement type="package" version="1.4">matplotlib</requirement> - </requirements> - <stdio> - <!-- Assume anything other than zero is an error --> - <exit_code range="1:" /> - <exit_code range=":-1" /> - </stdio> - <command interpreter="bash"> - vcfs2fasta.sh - $output - #if str($mix_cond.mix) == "yes": - $mix_cond.mix_value - #else - NOTTHERE - #end if - #if str($cols_cond.cols) == "yes": - $cols_cond.column_ns - #else - NOTTHERE - #end if - #if str($sample_cond.sample) == "yes": - $sample_cond.sample_ns - #else - NOTTHERE - #end if - #if str($reference_cond.reference) == "yes": - $reference_cond.ref_fa - #else - NOTTHERE - #end if - #if str($include_cond.include) == "yes": - $include_cond.in_bed - #else - NOTTHERE - #end if - #if str($exclude_cond.exclude) == "yes": - $exclude_cond.ex_bed - #else - NOTTHERE - #end if - #for $i, $input_vcf in enumerate( $input_vcfs ): - "${input_vcf}" - #end for - </command> - - <inputs> - <param name="input_vcfs" type="data" multiple="true" format="vcf" label="Input VCF file(s)" /> - <conditional name="mix_cond"> - <param name="mix" type="select" label="With Mixtures" help="Specify this option with a threshold to output mixtures above this threshold."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="mix_value" type="float" value="0.5" label="Mixture value" /> - </when> - </conditional> - <conditional name="cols_cond"> - <param name="cols" type="select" label="Column Ns" help="Keeps columns with fraction of Ns above specified threshold."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="column_ns" type="float" value="0.5" label="Column Ns value" /> - </when> - </conditional> - <conditional name="sample_cond"> - <param name="sample" type="select" label="Sample Ns" help="Keeps samples with fraction of Ns above specified threshold."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="sample_ns" type="float" value="0.5" label="Sample Ns value" /> - </when> - </conditional> - <conditional name="reference_cond"> - <param name="reference" type="select" label="Reference genome file" help="If path to reference specified, then whole genome will be outputted."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="ref_fa" type="data" format="fasta" label="Reference fasta file" help="Fasta format"/> - </when> - </conditional> - <conditional name="include_cond"> - <param name="include" type="select" label="Include region" help="Specify regions to include in a bed file."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="in_bed" type="data" format="bed" label="Include regions bed file" help="bed format"/> - </when> - </conditional> - <conditional name="exclude_cond"> - <param name="exclude" type="select" label="Exclude region" help="Specify regions to exclude in a bed file."> - <option value="yes">Specify</option> - <option value="no" selected="true">Do not specify</option> - </param> - <when value="yes"> - <param name="ex_bed" type="data" format="bed" label="Exclude regions bed file" help="bed format"/> - </when> - </conditional> - </inputs> - - <outputs> - <data format="fasta" name="output" label="${tool.name} on ${on_string}: FASTA file" /> - </outputs> - <test> - <param name="input_vcfs" value="1_short.vcf" ftype="vcf" /> - <param name="input_vcfs" value="2_short.vcf" ftype="vcf" /> - <output name="output" file="testresult.fa" ftype="fasta" /> - </test> - <help> - - </help> -</tool> |