Translate transcripts from the input BED file into protein sequences.
The genomic sequence can be obtained in one of three ways:
- supplied in an extra column of the BED input file
- retrieved from a twobit genomic reference file
- retrieved from the Ensembl REST API (for Ensembl transcripts)
INPUTS
- BED file with at least the standard 12 columns
- Genome reference in twobit format (optional)
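To illustrate what the 12 standard columns provide, here is a minimal sketch of reading one BED12 line and deriving the exon and coding intervals. It relies only on the standard BED convention that blockStarts are relative to chromStart. The function names `parse_bed12` and `cds_intervals` and the dictionary layout are hypothetical, chosen for illustration; they are not taken from the tool itself.

```python
def parse_bed12(line):
    """Parse one tab-separated BED12 line into a dict (hypothetical layout)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected at least 12 BED columns")
    chrom_start = int(fields[1])
    # blockSizes and blockStarts are comma-separated and may end with a comma
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    # blockStarts are relative to chromStart; intervals are 0-based, half-open
    exons = [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]
    return {
        "chrom": fields[0],
        "chromStart": chrom_start,
        "chromEnd": int(fields[2]),
        "name": fields[3],
        "strand": fields[5],
        "thickStart": int(fields[6]),
        "thickEnd": int(fields[7]),
        "exons": exons,
        "extra": fields[12:],  # any columns beyond the standard 12
    }


def cds_intervals(record):
    """Clip each exon to the thickStart..thickEnd range to get coding intervals."""
    coding = []
    for start, end in record["exons"]:
        clipped_start = max(start, record["thickStart"])
        clipped_end = min(end, record["thickEnd"])
        if clipped_start < clipped_end:
            coding.append((clipped_start, clipped_end))
    return coding


example = "chr1\t1000\t2000\ttx1\t0\t+\t1100\t1900\t0\t2\t300,400,\t0,600,\tACGT"
record = parse_bed12(example)
print(record["exons"])
print(cds_intervals(record))
```

In this example the 13th column (`ACGT`) stands in for a sequence supplied in an extra column; a real input would carry the full transcript sequence there.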
OUTPUTS
- FASTA of transcript translations
- BED with the genomic location of the translated protein. The added 13th column contains the protein sequence.
OPTIONS
- Feature translation
  - cDNA - three-frame translations of the cDNA sequences, with one output for each sequence between STOP codons
  - CDS - three-frame translations of the CDS (the coding sequence defined by thickStart and thickEnd in the BED file)
- Translation filtering
  - can be trimmed to a Methionine start codon
  - can be split into peptides by an enzyme digestion
  - must exceed a specified minimum length
- BED filtering
  - genomic regions
  - Ensembl biotype, if the BED contains the 20 columns as retrieved from the Ensembl REST API
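The translation and filtering options above can be sketched as follows. This is an illustrative sketch, not the tool's actual code: it assumes the standard genetic code, treats the minimum length as "at least `min_length` residues" (the boundary case is an assumption), and uses a simple trypsin rule (cleave after K or R unless followed by P) as one example of an enzyme digestion. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def three_frame_translations(cdna, min_length=1, trim_to_met=False):
    """Return (frame, peptide) pairs for each stretch between STOP codons."""
    results = []
    for frame in range(3):
        protein = translate(cdna[frame:])
        for segment in protein.split("*"):
            if trim_to_met:
                start = segment.find("M")
                if start < 0:
                    continue  # no Methionine in this stretch
                segment = segment[start:]
            # assumption: keep segments with at least min_length residues
            if len(segment) >= min_length:
                results.append((frame, segment))
    return results


def digest_trypsin(protein):
    """Cleave after K or R unless the next residue is P (simple trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1:i + 2]
        if residue in "KR" and following != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


cdna = "ATGAAATAGCCATGGGTTAA"
print(three_frame_translations(cdna))
print(three_frame_translations(cdna, min_length=2, trim_to_met=True))
print(digest_trypsin("MKRPAK"))
```

Splitting on `*` after translating each frame gives one candidate per stretch between STOP codons; trimming and length filtering are then applied to each candidate independently, so the order of those two steps affects which sequences survive.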