
GBK-to-GFF (version 2.1.0)
The GenBank flat file format consists of an annotation section and a sequence section.
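As a rough illustration of that two-part layout, the sketch below splits a single GenBank record at its ORIGIN line, which marks the start of the sequence section, and stops at the terminating "//" line. The function name `split_genbank_record` and the toy record are invented for this example; this is not the code the tool itself uses, and it assumes one well-formed record per input.

```python
def split_genbank_record(text):
    """Split one GenBank record into (annotation, sequence) strings."""
    annotation_lines = []
    sequence_chunks = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            # "//" terminates the record.
            break
        if line.startswith("ORIGIN"):
            # Everything after ORIGIN belongs to the sequence section.
            in_sequence = True
            continue
        if in_sequence:
            # Sequence lines look like "        1 atggccaagt aa":
            # drop the leading position number and the spacing.
            sequence_chunks.extend(
                part for part in line.split() if not part.isdigit()
            )
        else:
            annotation_lines.append(line)
    return "\n".join(annotation_lines), "".join(sequence_chunks)


# Toy record made up for illustration only (not real GenBank data).
TOY_RECORD = """\
LOCUS       TOY0001                   12 bp    DNA     linear   UNK
DEFINITION  Hypothetical toy record.
FEATURES             Location/Qualifiers
     gene            1..12
                     /gene="toyA"
ORIGIN
        1 atggccaagt aa
//
"""

annotation, sequence = split_genbank_record(TOY_RECORD)
print(sequence)  # atggccaagtaa
```

The annotation string keeps the LOCUS, DEFINITION, and FEATURES lines, which is where a converter would find the features to write out as GFF.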

What it does

This tool converts data from GenBank flat file format to GFF (scroll down for format descriptions).


Example


Reference

GBK-to-GFF is part of the Oqtans package and is cited as [1].

[1] Sreedharan VT, Schultheiss SJ, Jean G, et al. Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis. Bioinformatics (2014). doi:10.1093/bioinformatics/btt731

About file formats

GenBank format: An example of a GenBank record may be viewed here.

GFF: Generic Feature Format is a format for describing genes and other features associated with DNA, RNA, and protein sequences. Each GFF line has nine tab-separated fields:

1. seqid - The name of the sequence. Must be a chromosome, scaffold, or contig.
2. source - The program that generated this feature.
3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
4. start - The starting position of the feature in the sequence. The first base is numbered 1.
5. end - The ending (stop) position of the feature (inclusive).
6. score - A score between 0 and 1000. If there is no score value, enter ".".
7. strand - Valid entries are '+', '-', or '.' (for don't know/care).
8. phase - If the feature is a coding exon, phase should be a number from 0 to 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
9. attributes - Also called the group field. All lines with the same group are linked together into a single item.
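To make the nine-field layout concrete, here is a minimal sketch that parses one GFF line into named fields. The names `GffFeature` and `parse_gff_line`, the source label `gbk_to_gff`, and the example line are assumptions made for illustration; they are not taken from the tool's output or its code.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int              # 1-based start position
    end: int                # field 5, the inclusive stop/end position
    score: Optional[float]  # None when the field is "."
    strand: str             # "+", "-", or "."
    phase: Optional[int]    # 0, 1, 2, or None when the field is "."
    attributes: str         # kept as the raw ninth column


def parse_gff_line(line):
    """Parse one tab-separated GFF line into a GffFeature."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    if strand not in ("+", "-", "."):
        raise ValueError(f"invalid strand: {strand!r}")
    return GffFeature(
        seqid,
        source,
        ftype,
        int(start),
        int(end),
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,
    )


# Hypothetical line, invented for this example.
example_line = "chr1\tgbk_to_gff\tCDS\t101\t250\t.\t+\t0\tID=cds1;Parent=mRNA1"
feature = parse_gff_line(example_line)
print(feature.type, feature.start, feature.end, feature.phase)  # CDS 101 250 0
```

The attributes column is left unparsed here because its syntax differs between GFF versions.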

Copyright

GBK-to-GFF Wrapper Version 0.6 (Apr 2015)

2009-2015 Max Planck Society, University of Tübingen & Memorial Sloan Kettering Cancer Center