identify long, non-overlapping ORFs
macros.xml
&1
]]>
* Extract * ===> * build-icm * ===> * glimmer3 *
* * * * * * * *
************** ************** ************** **************
-----
**Example**
* input::
-Genome Sequence
CELF22B7 C.aenorhabditis elegans (Bristol N2) cosmid F22B7
GATCCTTGTAGATTTTGAATTTGAAGTTTTTTCTCATTCCAAAACTCTGT
GATCTGAAATAAAATGTCTCAAAAAAATAGAAGAAAACATTGCTTTATAT
TTATCAGTTATGGTTTTCAAAATTTTCTGACATACCGTTTTGCTTCTTTT
TTTCTCATCTTCTTCAAATATCAATTGTGATAATCTGACTCCTAACAATC
GAATTTCTTTTCCTTTTTCTTTTTCCAACAACTCCAGTGAGAACTTTTGA
ATATCTTCAAGTGACTTCACCACATCAGAAGGTGTCAACGATCTTGTGAG
AACATCGAATGAAGATAATTTTAATTTTAGAGTTACAGTTTTTCCTCCGA
CAATTCCTGATTTACGAACATCTTCTTCAAGCATTCTACAGATTTCTTGA
TGCTCTTCTAGGAGGATGTTGAAATCCGAAGTTGGAGAAAAAGTTCTCTC
AACTGAAATGCTTTTTCTTCGTGGATCCGATTCAGATGGACGACCTGGCA
GTCCGAGAGCCGTTCGAAGGAAAGATTCTTGTGAGAGAGGCGTGAAACAC
AAAGGGTATAGGTTCTTCTTCAGATTCATATCACCAACAGTTTGAATATC
CATTGCTTTCAGTTGAGCTTCGCATACACGACCAATTCCTCCAACCTAAA
AAATTATCTAGGTAAAACTAGAAGGTTATGCTTTAATAGTCTCACCTTAC
GAATCGGTAAATCCTTCAAAAACTCCATAATCGCGTTTTTATCATTTTCT
.....
- Cutoff 1.5
* output::
Sequence file = /home/mohammed/galaxy-central/database/files/000/dataset_34.dat
Excluded regions file = none
Circular genome = true
Initial minimum gene length = 90 bp
Determine optimal min gene length to maximize number of genes
Maximum overlap bases = 30
Start codons = atg,gtg,ttg
Stop codons = taa,tag,tga
Sequence length = 40222
Final minimum gene length = 97
Putative Genes:
00001 40137 52 +2 0.892
00002 1319 1095 -3 0.654
00003 1555 1391 -2 0.793
00004 1953 2066 +3 1.078
00005 2045 2146 +2 0.919
00006 4463 4759 +2 0.985
00007 6785 6582 -3 1.033
00008 6862 7020 +1 0.915
00009 7300 7488 +1 0.900
00010 7463 7570 +2 0.912
00011 8399 8527 +2 1.044
00012 10652 10545 -3 0.895
00013 12170 12066 -3 1.108
00014 13891 13748 -2 0.998
00015 14157 14044 -1 1.026
00016 15285 15410 +3 0.928
00017 15829 15704 -2 0.949
....
]]>