virannot/LICENSE virannot/ virannot/tool_data/virannot_blastdb.loc virannot/tool_data/virannot_hmmdb.loc virannot/tool_data/virannot_rfamdb.loc virannot/tool_data_table_conf.xml virannot/ virannot/wrapper.xml
wrapper.xml
diff -r 779a817f6b10 -r f8941b34bb96 virannot/LICENSE
diff -r 779a817f6b10 -r f8941b34bb96 virannot/
b'@@ -0,0 +1,197 @@\n+# VirAnnot\n+De novo automatic Viral Annotation\n+\n+VirAnnot is a script written in Python 2.7 that annotates viral genomes automatically (using a de novo algorithm) and predict the function of their proteins using BLAST and HMMER.\n+\n+## REQUIREMENTS:\n+\n+Before using this script, the following Python modules and programs should be installed:\n+\n+* Python modules:\n+\t- BCBio (\n+\t- Biopython (Bio module; Cock et al. 2009)\n+\n+* Programs:\n+\t- GNU Parallel (Tange 2011): it is used to parallelize HMMER. The program is publicly available at under the GPLv3 licence.\n+\t- LASTZ (Harris 2007): it is used to predict the circularity of the contigs. The program is publicly available at under the MIT licence.\n+\t- Prodigal (Hyatt et al. 2010): it is used to predict the ORFs. When the contig is smaller than 20,000 bp, MetaProdigal (Hyatt et al. 2012) is automatically activated instead of normal Prodigal. This program is publicly available at under the GPLv3 licence.\n+\t- BLAST+ (Camacho et al. 2008): it is used to predict the function of the predicted proteins according to homology. This suite is publicly available at under the GPLv2 licence. Databases are available at\n+\t- HMMER (Finn et al. 2011): it is used to predict the function of the predicted proteins according to Hidden Markov Models. This suite is publicly available at under the GPLv3 licence. Databases must be in FASTA format and examples of potential databases are UniProtKB ( or PFAM (\n+\t- INFERNAL (Nawrocki and Eddy 2013): it is used to predict ribosomal RNA in the contigs when using the RFAM database (Nawrocki et al. 2015). This program is publicly available at under the BSD licence and RFAM database is available at\n+\t- ARAGORN (Laslett and Canback 2004): it is used to predict tRNA sequences in the contig. This program is publicly available at under the GPLv2 licence.\n+\t- PILERCR (Edgar 2007): it is used to predict CRISPR repeats in your contig. 
This program is freely available at under a public licence.\n+\t- Tandem Repeats Finder (TRF; Benson 1999): it is used to predict the tandem repeats in your contig. This program is freely available at under a custom licence.\n+\t- Inverted Repeats Finder (IRF; Warburton et al. 2004): it is used to predict the inverted repeats in your contig. This program is freely available at under a custom licence.\n+\n+Although you can install the programs manually, we strongly recommend the use of the Docker image to create an environment for virannot. The link to the Docker image is\n+\n+However, you will need to download the databases for BLAST, HMMER and INFERNAL:\n+* BLAST DBs:\n+* RFAM (INFERNAL):\n+* UniProtKB (HMMER):\n+* PFAM (HMMER):\n+\n+Note that this bioinformatic pipeline only takes protein databases (e.g. "nr", "swissprot")!\n+\n+When using this program, you must cite it:\n+\n+<Under construction>\n+\n+## PARAMETERS:\n+\n+The program has the following two kinds of arguments:\n+\n+### Mandatory parameters:\n+\n+<table>\n+<tr><td>--input FASTAFILE</td><td>Inp'..b'nly the parallelization of HMMER and to run BLAST using multiple threads. \n+* v 0.6.1 - Fixed issue with parallel HMMER (the program tended to take all available CPUs regardless of the arguments given) and with the BLAST/HMMER decision trees (typos).\n+* v 0.6.0 - Replaced HHSUITE with HMMER 3.1 to predict protein function according to Hidden Markov Models. In a recent benchmark (as well as internal ones), we found that HHPred tends to be the slowest program to predict protein function (compared with PHMMER and BLASTP). Additionally, HMMER had a high accuracy when proteins are annotated (Saripella et al. 2016). Moreover, it has the advantage that the databases must be in FASTA format (such as UniProt and even PFAM), which is a standard format. For all these reasons, we replaced HHSUITE with HMMER 3.1. 
Additionally, fixed small issues related to the GenBank file (omission of the contig topology as well as the name of the locus).\n+* v 0.5.0 - Implemented PILERCR to predict CRISPR repeat regions. Additionally, fixed errors in the rRNA prediction and inverted and tandem repeats.\n+* v 0.4.0 - Replaced RNAmmer v 1.2 with INFERNAL 1.1 + RFAM to predict rRNA in the contigs. In this case, you must specify where you have downloaded the RFAM database using the "--rfamdb" option.\n+* v 0.3.0 - Implemented RNAmmer v 1.2 to predict rRNA in the contigs. If this program predicts ribosomal genes, a warning is printed (as viral sequences do not have ribosomal genes).\n+* v 0.2.0 - Added parallelization of BLAST and HHSUITE. To do that, GNU Parallel (Tange 2011) is required. To disable this option, run the program with the "--noparallel" option.\n+* v 0.1.0 - Original version of the program.\n+\n+## REFERENCES:\n+\n+\t- Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27: 573\xe2\x80\x9380.\n+\t- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2008) BLAST+: architecture and applications. BMC Bioinformatics 10: 421.\n+\t- Edgar RC (2007) PILER-CR: fast and accurate identification of CRISPR repeats. BMC Bioinformatics 8: 18.\n+\t- Finn RD, Clements J, Eddy SR (2011) HMMER web server: interactive sequence similarity searching. Nucleic Acids Research 39: W29-37.\n+\t- Fozo EM, Makarova KS, Shabalina SA, Yutin N, Koonin EV, Storz G (2010) Abundance of type I toxin-antitoxin systems in bacteria: searches for new candidates and discovery of novel families. Nucleic Acids Research 38: 3743-59.\n+\t- Harris RS (2007) Improved pairwise alignment of genomic DNA. Ph.D. Thesis, The Pennsylvania State University. \n+\t- Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. 
BMC Bioinformatics 11: 119.\n+\t- Hyatt D, Locascio PF, Hauser LJ, Uberbacher EC (2012) Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 28: 2223-30.\n+\t- Laslett D, Canback B (2004) ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Research 32: 11\xe2\x80\x9316.\n+\t- Nawrocki EP, Eddy SR (2013) Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29: 2933-35.\n+\t- Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD (2015) Rfam 12.0: updates to the RNA families database. Nucleic Acids Research 43: D130-7.\n+\t- Saripella GV, Sonnhammer EL, Forslund K (2016) Benchmarking the next generation of homology inference tools. Bioinformatics 32: 2636-41.\n+\t- Seemann T (2014) Prokka: rapid prokaryote genome annotation. Bioinformatics 30: 2068-9.\n+\t- Tange O (2011) GNU Parallel - The Command-Line Power Tool. ;login: The USENIX Magazine 36: 42-7.\n+\t- Warburton PE, Giordano J, Cheung F, Gelfand Y, Benson G (2004) Inverted repeat structure of the human genome: The X-chromosome contains a preponderance of large, highly homologous inverted repeats that contain testes genes. Genome Research 14: 1861-9.\n' |
diff -r 779a817f6b10 -r f8941b34bb96 virannot/tool_data/virannot_blastdb.loc
@@ -0,0 +1,31 @@
+# virannot_blastdb.loc
+# This is a *.loc.sample file distributed with Galaxy that enables tools
+# to use a directory of indexed data files. This one is for the protein BLAST databases used by virannot.
+# See the wiki:
+# First create these data files and save them in your own data directory structure.
+# Then, create a virannot_blastdb.loc file to use those databases with tools.
+# Copy this file, save it with the same name (minus the .sample),
+# follow the format examples, and store the result in this directory.
+# The file should include a one-line entry for each database.
+# The path points to the "basename" for the set, not a specific file.
+# It has four text columns separated by TABS.
+#
+# <unique_build_id> <dbkey> <display_name> <file_base_path>
+#
+# So, for example, if you had the nr protein database stored under the basename:
+#
+# /data/databases/blast/nr/nr
+#
+# that is, a directory containing the BLAST database files for nr, such as:
+# nr.pal
+# nr.00.phr
+# nr.00.pin
+# nr.00.psq
+# nr.01.phr
+# nr.01.pin
+# nr.01.psq
+#
+# then the virannot_blastdb.loc entry could look like this:
+
+nr	nr	Non_redundant (nr)	/data/databases/blast/nr/nr
+swissprot	swissprot	Swissprot (swissprot)	/data/databases/blast/swissprot/swissprot
diff -r 779a817f6b10 -r f8941b34bb96 virannot/tool_data/virannot_hmmdb.loc
@@ -0,0 +1,5 @@
+# <unique_build_id> <dbkey> <display_name> <file_base_path>
+#
+uniprot_trembl	uniprot_trembl	UniProt TrEMBL	/data/databases/SwissProt_UniProt/uniprot_trembl.fasta
+uniprot_sprot	uniprot_sprot	UniProt Swiss-Prot	/data/databases/SwissProt_UniProt/uniprot_sprot.fasta
+
diff -r 779a817f6b10 -r f8941b34bb96 virannot/tool_data/virannot_rfamdb.loc
@@ -0,0 +1,4 @@
+# <unique_build_id> <dbkey> <display_name> <file_base_path>
+#
+Rfam	Rfam	Rfam	/data/databases/rfam/
+
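Review note: the three `.loc` files above share one layout, four tab-separated columns with `#` comment lines, matching the `value, dbkey, name, path` columns declared in `tool_data_table_conf.xml`. A minimal Python 3 sketch of how such a file can be read is given here for illustration only; the function name `parse_loc` is hypothetical and this is not how Galaxy itself loads data tables.

```python
import io

# Column names as declared in the <columns> element of the data table config.
COLUMNS = ("value", "dbkey", "name", "path")


def parse_loc(handle, columns=COLUMNS, comment_char="#"):
    """Return one dict per data line of a tab-separated .loc file."""
    entries = []
    for raw in handle:
        line = raw.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(columns):
            raise ValueError("expected %d tab-separated columns, got %d: %r"
                             % (len(columns), len(fields), line))
        entries.append(dict(zip(columns, fields)))
    return entries


# Sample content copied from the hmmdb .loc hunk above (columns joined by tabs).
sample = io.StringIO(
    "# <unique_build_id> <dbkey> <display_name> <file_base_path>\n"
    "#\n"
    "uniprot_sprot\tuniprot_sprot\tUniProt Swiss-Prot\t"
    "/data/databases/SwissProt_UniProt/uniprot_sprot.fasta\n"
    "\n"
)
entries = parse_loc(sample)
print(entries[0]["name"], "->", entries[0]["path"])
```

A line whose columns are separated by spaces instead of tabs raises `ValueError` in this sketch, which is the most common way a hand-edited `.loc` file goes wrong.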
diff -r 779a817f6b10 -r f8941b34bb96 virannot/tool_data_table_conf.xml
@@ -0,0 +1,15 @@
+<!-- Use the file tool_data_table_conf.xml.oldlocstyle if you don't want to update your loc files as changed in revision 4550:535d276c92bc-->
+<tables>
+    <table name="virannot_blastdb" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="tool-data/virannot_blastdb.loc" />
+    </table>
+    <table name="virannot_rfamdb" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="tool-data/virannot_rfamdb.loc" />
+    </table>
+    <table name="virannot_hmmdb" comment_char="#">
+        <columns>value, dbkey, name, path</columns>
+        <file path="tool-data/virannot_hmmdb.loc" />
+    </table>
+</tables>
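Review note: every `<table>` entry in a data table config names a `.loc` file, and a select parameter backed by that table stays empty if the file is not where the config says it is. A minimal Python 3 sketch of a consistency check is given below, assuming the config is available as a string and the set of shipped `.loc` file names is known. The function name `missing_loc_files` and the sample table names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def missing_loc_files(conf_xml, shipped_loc_files):
    """Return (table name, loc file name) pairs whose loc file is not shipped."""
    root = ET.fromstring(conf_xml)
    problems = []
    for table in root.findall("table"):
        file_el = table.find("file")
        loc_name = None
        if file_el is not None:
            loc_name = os.path.basename(file_el.get("path", ""))
        if loc_name not in shipped_loc_files:
            problems.append((table.get("name"), loc_name))
    return problems


# Hypothetical config: the second table points at a file that is not shipped.
SAMPLE_CONF = """<tables>
    <table name="example_blastdb" comment_char="#">
        <columns>value, dbkey, name, path</columns>
        <file path="tool-data/example_blastdb.loc" />
    </table>
    <table name="example_otherdb" comment_char="#">
        <columns>value, dbkey, name, path</columns>
        <file path="tool-data/example_missing.loc" />
    </table>
</tables>"""

shipped = {"example_blastdb.loc", "example_otherdb.loc"}
problems = missing_loc_files(SAMPLE_CONF, shipped)
print(problems)
```

Running a check like this before committing would catch a table whose `<file path>` does not match any `.loc` file in the repository.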
diff -r 779a817f6b10 -r f8941b34bb96 virannot/
b'@@ -0,0 +1,803 @@\n+#!/usr/bin/env python\n+\n+# -*- coding: utf-8 -*-\n+\n+# Virannot - De-novo viral genome annotator\n+#\n+# Copyright (C) 2017 - Enrique Gonzalez-Tortuero\n+# Vimalkumar Velayudhan\n+#\n+# This program is free software: you can redistribute it and/or modify\n+# it under the terms of the GNU General Public License as published by\n+# the Free Software Foundation, either version 3 of the License, or\n+# (at your option) any later version.\n+#\n+# This program is distributed in the hope that it will be useful,\n+# but WITHOUT ANY WARRANTY; without even the implied warranty of\n+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n+# GNU General Public License for more details.\n+#\n+# You should have received a copy of the GNU General Public License\n+# along with this program. If not, see <>.\n+\n+# Importing python libraries\n+from __future__ import print_function\n+import argparse\n+import csv\n+import fileinput\n+import fractions\n+import glob\n+import os\n+import re\n+import sys\n+import subprocess\n+from BCBio import GFF\n+from Bio import SeqIO\n+from Bio import SeqFeature\n+from Bio.Alphabet import IUPAC\n+from Bio.Seq import Seq\n+from Bio.SeqFeature import FeatureLocation\n+from Bio.SeqRecord import SeqRecord\n+from Bio.SeqUtils.ProtParam import ProteinAnalysis\n+from collections import OrderedDict, defaultdict\n+from time import strftime\n+\n+# Preparing functions\n+def batch_iterator(iterator, batch_size):\n+\tentry = True\n+\twhile entry:\n+\t\tbatch = []\n+\t\twhile len(batch) < batch_size:\n+\t\t\ttry:\n+\t\t\t\tentry = iterator.next()\n+\t\t\texcept StopIteration:\n+\t\t\t\tentry = None\n+\t\t\tif entry is None:\n+\t\t\t\tbreak\n+\t\t\tbatch.append(entry)\n+\t\tif batch:\n+\t\t\tyield batch\n+\n+def cmd_exists(cmd):\n+\treturn subprocess.call("type " + cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE) == 0\n+\n+def eprint(*args, **kwargs):\n+\tprint(*args, file=sys.stderr, **kwargs)\n+\n+#def find(name, path):\n+#\tfor root, dirs, files 
in os.walk(path):\n+#\t\tif name in files:\n+#\t\t\treturn os.path.join(root, name)\n+\n+def stringSplitByNumbers(x):\n+\tr = re.compile(\'(\\d+)\')\n+\tl = r.split(x)\n+\treturn [int(y) if y.isdigit() else y for y in l]\n+\n+# Defining the program version\n+version = "0.7.1"\n+\n+# Processing the parameters\n+parser = argparse.ArgumentParser(description=\'Virannot is a automatic de novo viral genome annotator.\')\n+basic_group = parser.add_argument_group(\'Basic options for virannot [REQUIRED]\')\n+\n+basic_group.add_argument("--input", dest="inputfile", type=str, required=True, help=\'Input file as a FASTA file\', metavar="FASTAFILE")\n+basic_group.add_argument("--blastdb", dest="blastdatabase", type=str, required=True, help=\'BLAST Database that will be used for the protein function prediction. The database must be an amino acid one, not nucleotidic\', metavar="BLASTDB")\n+basic_group.add_argument("--rfamdb", dest="rfamdatabase", type=str, required=True, help=\'RFAM Database that will be used for the ribosomal RNA prediction. RFAMDB should be in the format "/full/path/to/rfamdb/" and must be compressed accordingly (see INFERNAL manual) before running the script.\', metavar="RFAMDB")\n+basic_group.add_argument("--modifiers", dest="modifiers", type=str, required=True, help=\'Input file as a plain text file with the modifiers per every FASTA header according to SeqIn ( All modifiers must be written in a single line and are separated by a single space character. No space should be placed besides the = sign. For example: [organism=Serratia marcescens subsp. marcescens] [sub-species=marcescens] [strain=AH0650_Sm1] [topology=linear] [moltype=DNA] [tech=wgs] [gcode=11] [country=Australia] [isolation-source=sputum]. 
This line will be copied and printed along with the record name as the definition line of every contig sequence.\', metavar="TEXTFILE")\n+\n+advanced_group = parser.add_argument_group(\'Advanced options for virannot [OPTIONAL]\')\n+advanced_group.add_argument("--readlength'..b' % newfile\n+\twith open(newtempgbk, "rU") as gbktempfile, open(newgbk, "w") as gbkrealfile:\n+\t\tnewpat = re.compile("D|RNA\\s+(CON|PHG|VRL|BCT)")\n+\t\tfor line in gbktempfile:\n+\t\t\tif line.startswith("LOCUS ") and, line):\n+\t\t\t\tif genomeshape[\'genomeshape\'] == "linear":\n+\t\t\t\t\tnewline = re.sub("bp DNA\\s+", "bp DNA linear ", line)\n+\t\t\t\telse:\n+\t\t\t\t\tnewline = re.sub("bp DNA\\s+", "bp DNA circular ", line)\n+\t\t\t\tgbkrealfile.write(newline)\n+\t\t\telse:\n+\t\t\t\tgbkrealfile.write(line)\n+\n+\tfor f in glob.glob("*.temp.gbk"):\n+\t\tos.remove(f)\n+\n+\tif args.gffprint==True:\n+\t\tnewgff = "%s.gff" % newfile\n+\t\twith open(newgff, "w") as outgff, open(newgbk, "rU") as ingbk:\n+\t\t\tGFF.write(SeqIO.parse(ingbk, "genbank"), outgff)\n+\n+\t# Removing intermediate files\n+\tos.remove(newfile)\n+\tos.remove("temporal_circular.fasta")\n+\tos.remove("temp.faa")\n+\tos.remove("temp_blast.csv")\n+\tos.remove("crisprfile.txt")\n+\tos.remove("trnafile.fasta")\n+\tos.remove("rrnafile.csv")\n+\tos.remove("trf_temp.dat")\n+\tos.remove("irf_temp.dat")\n+\tfor f in glob.glob("SEQ*"):\n+\t\tos.remove(f)\n+\n+# Joining all GENBANK files into one\n+listgbk = sorted(glob.glob(\'CONTIG_*.gbk\'))\n+gbkoutputfile = "%s.gbk" % root_output\n+with open(gbkoutputfile, \'w\') as finalgbk:\n+\tfor fname in listgbk:\n+\t\twith open(fname) as infile:\n+\t\t\tfor line in infile:\n+\t\t\t\tfinalgbk.write(line)\n+\n+for tempgbk in glob.glob("CONTIG_*.gbk"):\n+\tos.remove(tempgbk)\n+\n+# Joining all GFF files into one\n+if args.gffprint==True:\n+\tlistgff = sorted(glob.glob(\'CONTIG_*.gff\'))\n+\tgffoutputfile = "%s.gff" % root_output\n+\twith open(gffoutputfile, \'w\') as finalgff:\n+\t\tfor 
fname in listgff:\n+\t\t\twith open(fname) as infile:\n+\t\t\t\tfor line in infile:\n+\t\t\t\t\tfinalgff.write(line)\n+\tfor tempgff in glob.glob("CONTIG_*.gff"):\n+\t\tos.remove(tempgff)\n+\n+# Joining all TABLE files into one\n+listcsv = sorted(glob.glob(\'CONTIG_*.csv\'))\n+tbloutputfile = "%s.csv" % root_output\n+with open(tbloutputfile, \'w\') as finaltable:\n+\tfor fname in listcsv:\n+\t\twith open(fname) as infile:\n+\t\t\tfor line in infile:\n+\t\t\t\tfinaltable.write(line)\n+\n+for temptbl in glob.glob("CONTIG_*.csv"):\n+\tos.remove(temptbl)\n+\n+# Preparing sequences for GenBank submission (Original code from Wan Yu\'s script [])\n+allowed_qualifiers = [\'locus_tag\', \'gene\', \'product\', \'pseudo\', \'protein_id\', \'gene_desc\', \'old_locus_tag\', \'note\', \'inference\', \'organism\', \'mol_type\', \'strain\', \'sub_species\', \'isolation-source\', \'country\']\n+newfastafile = "%s.fasta" % root_output\n+newtablefile = "%s.tbl" % root_output\n+with open(args.modifiers, "rU") as modifiers, open(gbkoutputfile, "r") as genbank_fh, open(newfastafile, "w") as fasta_fh, open(newtablefile, "w") as feature_fh: \n+\tinfo = modifiers.readline()\n+\twholelist = list(SeqIO.parse(genbank_fh, \'genbank\'))\n+\tfor record in wholelist:\n+\t\tif len(record) <= args.mincontigsize:\n+\t\t\teprint("WARNING: Skipping small contig %s" %\n+\t\t\tcontinue\n+\t\trecord.description = "%s %s" % (, info)\n+\t\tSeqIO.write([record], fasta_fh, \'fasta\')\n+\t\tprint(\'>Feature %s\' % (, file=feature_fh)\n+\t\tfor line in record.features:\n+\t\t\tif line.strand == 1:\n+\t\t\t\tprint(\'%d\\t%d\\t%s\' % (line.location.nofuzzy_start + 1, line.location.nofuzzy_end, line.type), file=feature_fh)\n+\t\t\telse:\n+\t\t\t\tprint(\'%d\\t%d\\t%s\' % (line.location.nofuzzy_end, line.location.nofuzzy_start + 1, line.type), file=feature_fh)\n+\t\t\tfor (key, values) in line.qualifiers.iteritems():\n+\t\t\t\tif key not in allowed_qualifiers:\n+\t\t\t\t\tcontinue\n+\t\t\t\tfor v in 
values:\n+\t\t\t\t\tprint(\'\\t\\t\\t%s\\t%s\' % (key, v), file=feature_fh)\n+\n+# Final statement\n+eprint("Genome annotation done!")\n+eprint("The GenBank file is %s" % gbkoutputfile)\n+if args.gffprint==True:\n+\teprint("The GFF3 file is %s" % gffoutputfile)\n+eprint("The table file for GenBank submission is %s" % tbloutputfile)\n+eprint("The FASTA file for GenBank submission is %s" % newfastafile)\n+eprint("The table file with all protein information is %s" % newtablefile)\n' |
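Review note: two helpers near the top of the script, `batch_iterator` and `stringSplitByNumbers`, are easier to reason about in isolation. The sketch below restates them in Python 3 under my own assumptions; the script itself targets Python 2.7, and the name `natural_key` is hypothetical. It shows batching an iterator into fixed-size lists and sorting names so that embedded numbers compare numerically.

```python
import re


def batch_iterator(iterator, batch_size):
    """Yield lists of up to batch_size consecutive entries from iterator."""
    batch = []
    for entry in iterator:
        batch.append(entry)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch


_DIGITS = re.compile(r"(\d+)")


def natural_key(text):
    """Split text into digit and non-digit runs; digit runs become ints."""
    return [int(tok) if tok.isdigit() else tok for tok in _DIGITS.split(text)]


names = ["CONTIG_10.gbk", "CONTIG_2.gbk", "CONTIG_1.gbk"]
ordered = sorted(names, key=natural_key)
batches = list(batch_iterator(iter(range(5)), 2))
print(ordered)
print(batches)
```

A plain `sorted(names)` places `CONTIG_10.gbk` before `CONTIG_2.gbk`, so the order in which per-contig files are concatenated can differ from contig order unless a key like this is used or the numbers are zero-padded.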
diff -r 779a817f6b10 -r f8941b34bb96 virannot/wrapper.xml
b'@@ -0,0 +1,258 @@\n+<tool id="virannot" name="virannot" version="0.7.1">\n+ <description>de novo viral genome annotator</description>\n+ <requirements>\n+ <container type="docker">vimalkvn/virannot</container>\n+ </requirements>\n+ <stdio>\n+ <exit_code range="1:" />\n+ </stdio>\n+ <command><![CDATA[\n+python $__tool_directory__/\n+--input $input\n+--blastdb $blastdb.fields.path\n+--rfamdb $rfamdb.fields.path\n+--modifiers $modifiers\n+--threads \\${GALAXY_SLOTS:-5}\n+--typedata $typedata_select\n+--gcode $gcode_select\n+--out "default"\n+--minrepeat $minrepeat\n+--maxrepeat $maxrepeat\n+--minspacer $minspacer\n+--maxspacer $maxspacer\n+\n+#if $readlength\n+ --readlength $readlength\n+#end if\n+#if $locus\n+ --locus $locus\n+#end if\n+#if $gffprint\n+ --gff\n+#end if\n+#if str($blastevalue)\n+ --blastevalue $blastevalue\n+#end if\n+#if str($mincontigsize)\n+ --mincontigsize $mincontigsize\n+#end if\n+#if str($idthr)\n+ --idthr $idthr\n+#end if\n+#if str($coverthr)\n+ --coverthr $coverthr\n+#end if\n+#if str($diffid)\n+ --diffid $diffid\n+#end if\n+#if $blastexh\n+ --blastexh\n+#end if\n+#if str($method.use_phmmer) == "yes"\n+--hmmdb $method.hmmdb.fields.path\n+--hmmerevalue $method.hmmerevalue\n+#else:\n+--fast\n+#end if\n+]]></command>\n+ <inputs>\n+ <param name="input" type="data" format="fasta" label="(Viral) contigs to annotate" help="Input file as a FASTA file. It can contain multiple sequences (e.g. 
metagenomic contigs)" />\n+ <param name="blastdb" type="select" label="BLAST Database" help="BLAST Protein Database that will be used for the protein function prediction">\n+ <options from_data_table="virannot_blastdb">\n+ <filter type="sort_by" column="2"/>\n+ <validator type="no_options" message="No indexes are available for the selected input dataset"/>\n+ </options>\n+ </param>\n+ <param name="rfamdb" type="select" label="RFAM database used for ribosomal RNA prediction">\n+ <options from_data_table="virannot_rfamdb">\n+ <filter type="sort_by" column="2"/>\n+ <validator type="no_options" message="No indexes are available for the selected input dataset"/>\n+ </options>\n+ </param>\n+ <param name="modifiers" type="data" format="txt" label="Metadata of the contigs" help="Modifiers per every FASTA header according to SeqIn (" />\n+\t<param name="typedata_select" type="select" label="GenBank division (--typedata)">\n+\t <option value="CON" selected="true">Contig</option>\n+\t <option value="PHG">Phages</option>\n+\t <option value="VRL">Eukaryotic/Archaea virus</option>\n+\t <option value="BCT">Prokaryotic chromosome</option>\n+\t</param>\n+\t<param name="gcode_select" type="select" label="Number of GenBank translation table (--gcode)">\n+\t <option value="1">Standard genetic code [Eukaryotic]</option>\n+\t <option value="2">Vertebrate mitochondrial code</option>\n+\t <option value="3">Yeast mitochondrial code</option>\n+\t <option value="4">Mycoplasma/Spiroplasma and Protozoan/mold/coelenterate mitochondrial code</option>\n+\t <option value="5">Invertebrate mitochondrial code</option>\n+\t <option value="6">Ciliate, dasycladacean and hexamita nuclear code</option>\n+\t <option value="9">Echinoderm/flatworm mitochondrial code</option>\n+\t <option value="10">Euplotid nuclear code</option>\n+\t <option value="11" selected="true">Bacteria/Archaea/Phages/Plant plastid</option>\n+\t <option value="12">Alternative yeast nuclear code</option>\n+\t <option 
value="13">Ascidian mitochondrial code</option>\n+\t <option value="14">Alternative flatworm mitochondrial code</option>\n+\t <option value="16">Chlorophycean mitochondrial code</option>\n+\t <option value="21">Trematode mitochondrial code</option>\n+\t <option value="22">Scedenesmus obliquus mitochondrial code</option>\n+\t <option value="23">Thraustochytrium mitochondrial code</option>\n+\t <option value="'..b'mat="txt" label="${} on ${on_string}: tbl" from_work_dir="default.tbl" />\n+ </outputs>\n+ <tests>\n+ <test>\n+ <param name="input" ftype="fasta" value="rubella.fasta" />\n+ <param name="outputs" value="csv,gbk,fasta,tbl" />\n+ <output name="default_csv" file="default.csv" />\n+ <output name="default_gbk" file="default.gbk" />\n+ <output name="default_fasta" file="default.fasta" />\n+ <output name="default_tbl" file="default.tbl" />\n+ </test>\n+ <test>\n+ <param name="input" ftype="fasta" value="mu.fasta" />\n+ <param name="outputs" value="csv,gbk,fasta,tbl" />\n+ <output name="default_csv" file="default.csv" />\n+ <output name="default_gbk" file="default.gbk" />\n+ <output name="default_fasta" file="default.fasta" />\n+ <output name="default_tbl" file="default.tbl" />\n+ </test>\n+ </tests>\n+ <help><![CDATA[\n+**About VirAnnot**\n+\n+VirAnnot_ is a script written in Python 2.7 that annotates viral\n+genomes automatically (using a de novo algorithm) and predict the\n+function of their proteins using BLAST and HMMER.\n+\n+----\n+\n+**About this Galaxy wrapper**\n+\n+*Installation*\n+\n+#. `Docker <>`_ should first be installed and\n+ working on the server where this Galaxy instance is setup. The\n+ **galaxy** user should be part of the **docker** group.\n+\n+#. Download or clone the VirAnnot_ Github repository (as a submodule)\n+ in to the ``tools`` directory.\n+\n+#. Update ``config/tool_conf.xml`` like this::\n+\n+ <section id="annotation" name="Annotation">\n+ <tool file="virannot/wrapper.xml" />\n+ </section>\n+\n+#. 
Update ``config/tool_data_table_conf.xml`` to add location of loc\n+ files::\n+\n+ <!-- virannot databases -->\n+ <table name="virannot_blastdb" comment_char="#">\n+ <columns>value, dbkey, name, path</columns>\n+ <file path="tool-data/virannot_blastdb.loc" />\n+ </table>\n+ <table name="virannot_rfamdb" comment_char="#">\n+ <columns>value, dbkey, name, path</columns>\n+ <file path="tool-data/virannot_rfamdb.loc" />\n+ </table>\n+ <table name="virannot_hmmdb" comment_char="#">\n+ <columns>value, dbkey, name, path</columns>\n+ <file path="tool-data/virannot_hmmdb.loc" />\n+ </table>\n+\n+#. Copy ``.loc`` files from ``virannot/tool-data`` to\n+ ``galaxy/tool-data`` and update the database paths within those\n+ files.\n+\n+#. Restart Galaxy.\n+ \n+----\n+\n+**Output files**\n+\n+VirAnnot creates the following output files:\n+\n+* tbl - Table file with all protein information.\n+* gbk - GenBank format file with annotations.\n+* fasta - FASTA format file for GenBank submission\n+* csv - Table file for GenBank submission.\n+* gff - GFF3 format file (if option is selected)\n+\n+----\n+\n+**License and citation**\n+\n+VirAnnot_ and this Galaxy wrapper - `GPLv3 <>`_.\n+\n+\n+Galaxy\n+\n+- Goecks, J, Nekrutenko, A, Taylor, J and The Galaxy Team. "Galaxy: a\n+ comprehensive approach for supporting accessible, reproducible, and\n+ transparent computational research in the life sciences." \n+ Genome Biol. 2010 Aug 25;11(8):R86.\n+\n+- Blankenberg D, Von Kuster G, Coraor N, Ananda G, Lazarus R, Mangan M,\n+ Nekrutenko A, Taylor J. "Galaxy: a web-based genome analysis tool for\n+ experimentalists". Current Protocols in Molecular Biology. \n+ 2010 Jan; Chapter 19:Unit 19.10.1-21.\n+\n+- Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y,\n+ Blankenberg D, Albert I, Taylor J, Miller W, Kent WJ, Nekrutenko A. "Galaxy:\n+ a platform for interactive large-scale genome analysis." \n+ Genome Research. 
2005 Oct; 15(10):1451-5.\n+\n+You can use this tool only if you agree to the license terms of: `virannot`_.\n+\n+.. _VirAnnot:\n+\n+]]></help>\n+<!-- <citations>\n+ <citation type="doi">NOT YET</citation>\n+ </citations>\n+-->\n+</tool>\n' |
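Review note: the `<command>` template in the wrapper always passes the required options and then branches on `use_phmmer` to choose between `--hmmdb`/`--hmmerevalue` and `--fast`. The Python 3 sketch below mirrors that branching for a subset of the options only, as a way to reason about which argument lists the template can produce. It is not Galaxy's Cheetah rendering; `build_command`, the `params` dictionary and the script name `annotator.py` are hypothetical.

```python
def build_command(script, params, slots=5):
    """Assemble an argument list following the wrapper's conditional logic."""
    cmd = [
        "python", script,
        "--input", params["input"],
        "--blastdb", params["blastdb"],
        "--rfamdb", params["rfamdb"],
        "--modifiers", params["modifiers"],
        "--threads", str(slots),
        "--out", "default",
    ]
    # Boolean flag: only emitted when the option is switched on.
    if params.get("gffprint"):
        cmd.append("--gff")
    # HMMER branch: either a database plus e-value, or the fast mode.
    if params.get("use_phmmer") == "yes":
        cmd += ["--hmmdb", params["hmmdb"],
                "--hmmerevalue", str(params["hmmerevalue"])]
    else:
        cmd.append("--fast")
    return cmd


base = {
    "input": "contigs.fasta",
    "blastdb": "/data/databases/blast/nr/nr",
    "rfamdb": "/data/databases/rfam/",
    "modifiers": "modifiers.txt",
}
fast_cmd = build_command("annotator.py", dict(base, use_phmmer="no"))
hmm_cmd = build_command(
    "annotator.py",
    dict(base, use_phmmer="yes", hmmerevalue=0.001, gffprint=True,
         hmmdb="/data/databases/SwissProt_UniProt/uniprot_sprot.fasta"),
)
print(" ".join(fast_cmd))
print(" ".join(hmm_cmd))
```

Returning a list rather than a single string keeps paths with spaces intact if the result is later handed to `subprocess`.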
diff -r 779a817f6b10 -r f8941b34bb96 wrapper.xml
