What it does
Bakta is a tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs.
Input options
Organism options
You can describe the organism in the analysed FASTA file using free-text inputs for:
- genus
- species
- strain
- plasmid
Annotation options
1. You can specify whether all sequences (chromosome or plasmids) are complete.
2. You can provide your own Prodigal training file for CDS prediction.
3. The translation table can be changed; the default is table 11, for bacteria.
4. You can specify whether the bacterium is Gram-negative (-), Gram-positive (+) or unknown (the default is unknown).
5. You can keep the contig names present in the input file.
6. You can provide your own replicon table as a TSV/CSV file.
7. The compliance option produces annotation files that are ready to submit to public databases such as ENA, GenBank or EMBL.
8. You can provide a protein sequence file for annotation, in GenBank or FASTA format. In the FASTA format, each reference sequence can be provided in a short or long format:
# short:
>id gene~~~product~~~dbxrefs
MAQ...
# long:
>id min_identity~~~min_query_cov~~~min_subject_cov~~~gene~~~product~~~dbxrefs
MAQ...
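As an illustration of the two header layouts above, here is a minimal Python sketch that builds and parses such header lines. The function names `format_header` and `parse_header` are hypothetical helpers written for this example and are not part of Bakta. The sketch only assumes what the layouts show: fields are separated by `~~~`, and the id is separated from the fields by the first space. The `dbxrefs` field is treated as an opaque string.

```python
SEPARATOR = "~~~"

SHORT_KEYS = ["gene", "product", "dbxrefs"]
LONG_KEYS = ["min_identity", "min_query_cov", "min_subject_cov"] + SHORT_KEYS


def format_header(seq_id, gene, product, dbxrefs, thresholds=None):
    """Build a header line; pass three thresholds to get the long format."""
    fields = [gene, product, dbxrefs]
    if thresholds is not None:
        # Long format: identity and coverage thresholds come first.
        fields = [str(value) for value in thresholds] + fields
    return ">" + seq_id + " " + SEPARATOR.join(fields)


def parse_header(line):
    """Split a header line into a dict, detecting short vs. long format."""
    # Only the first space separates the id from the fields,
    # so product names may contain spaces.
    seq_id, _, description = line[1:].partition(" ")
    fields = description.split(SEPARATOR)
    if len(fields) == len(SHORT_KEYS):
        keys = SHORT_KEYS
    elif len(fields) == len(LONG_KEYS):
        keys = LONG_KEYS
    else:
        raise ValueError("expected 3 or 6 fields, got %d" % len(fields))
    record = dict(zip(keys, fields))
    record["id"] = seq_id
    return record


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(format_header("id1", "geneA", "Example protein", "DB:0001"))
    print(format_header("id1", "geneA", "Example protein", "DB:0001", (90, 80, 80)))
```

Building headers with a helper like this avoids hand-typing the separators, which is an easy place to drop or add a `~`.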
Skip steps
Some steps can be skipped:
- skip-trna: skip tRNA detection & annotation
- skip-tmrna: skip tmRNA detection & annotation
- skip-rrna: skip rRNA detection & annotation
- skip-ncrna: skip ncRNA detection & annotation
- skip-ncrna-region: skip ncRNA region detection & annotation
- skip-crispr: skip CRISPR array detection & annotation
- skip-cds: skip CDS detection & annotation
- skip-pseudo: skip pseudogene detection & annotation
- skip-sorf: skip sORF detection & annotation
- skip-gap: skip gap detection & annotation
- skip-ori: skip oriC/oriT detection & annotation
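For scripted use, the skip options listed above can be turned into a list of flags. The Python sketch below assumes that each option maps to a command-line flag spelled with a leading `--` (for example `--skip-trna`). That mapping is an assumption about the underlying command line and is not stated in this help. `skip_flags` is a hypothetical helper name.

```python
# Step names taken from the skip options listed above (without the "skip-" prefix).
SKIPPABLE_STEPS = (
    "trna", "tmrna", "rrna", "ncrna", "ncrna-region",
    "crispr", "cds", "pseudo", "sorf", "gap", "ori",
)


def skip_flags(steps):
    """Return "--skip-<step>" flags for the requested steps, in listing order."""
    requested = set(steps)
    unknown = sorted(requested - set(SKIPPABLE_STEPS))
    if unknown:
        # Fail early on typos instead of passing an unsupported flag along.
        raise ValueError("unknown step(s): " + ", ".join(unknown))
    # Assumption: the flag is the option name with a leading "--".
    return ["--skip-" + step for step in SKIPPABLE_STEPS if step in requested]


if __name__ == "__main__":
    print(skip_flags(["sorf", "crispr"]))
```

Validating the names against the known list means a misspelled step is reported before any annotation run starts.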
Output options
Bakta produces a number of output files; you can select which types you want:
- Summary of the annotation
- Annotated files
- Sequence files for nucleotides and/or amino acids