Glimmer3 (version 0.2)
Overlaps this short or shorter are ignored.
If the in-frame score >= N, then the region is given a number and considered a potential gene.
If 0.0 is specified, the GC percentage is computed from the input file.
Using this option will produce more short gene predictions.
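
As an illustration of the GC-percentage option above, the following is a minimal sketch of how such a value could be computed from a sequence. The function name `gc_percent` is hypothetical, and counting only unambiguous A, C, G and T bases is an assumption; this page does not say how Glimmer itself treats ambiguous bases.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C among the A, C, G and T bases.

    Assumption: ambiguous symbols such as N are ignored entirely.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    # Avoid dividing by zero for empty or fully ambiguous input.
    return 100.0 * gc / acgt if acgt else 0.0


print(gc_percent("GATC"))  # 50.0
```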

What it does

This is the main program that makes gene predictions based on an interpolated context model (ICM).

The ICM can be built from CDS extracted from related organisms (ICM builder). If you cannot build an ICM, you can use the non-knowledge-based Glimmer tool, which makes a de novo prediction.


Example

Input:

- Interpolated context model (ICM): use the 'Glimmer ICM builder' tool to create one
- Genome sequence in FASTA format

        >CELF22B7  C.aenorhabditis elegans (Bristol N2) cosmid F22B7
        GATCCTTGTAGATTTTGAATTTGAAGTTTTTTCTCATTCCAAAACTCTGT
        GATCTGAAATAAAATGTCTCAAAAAAATAGAAGAAAACATTGCTTTATAT
        TTATCAGTTATGGTTTTCAAAATTTTCTGACATACCGTTTTGCTTCTTTT
        TTTCTCATCTTCTTCAAATATCAATTGTGATAATCTGACTCCTAACAATC
        GAATTTCTTTTCCTTTTTCTTTTTCCAACAACTCCAGTGAGAACTTTTGA
        ATATCTTCAAGTGACTTCACCACATCAGAAGGTGTCAACGATCTTGTGAG
        AACATCGAATGAAGATAATTTTAATTTTAGAGTTACAGTTTTTCCTCCGA
        CAATTCCTGATTTACGAACATCTTCTTCAAGCATTCTACAGATTTCTTGA
        TGCTCTTCTAGGAGGATGTTGAAATCCGAAGTTGGAGAAAAAGTTCTCTC
        AACTGAAATGCTTTTTCTTCGTGGATCCGATTCAGATGGACGACCTGGCA
        GTCCGAGAGCCGTTCGAAGGAAAGATTCTTGTGAGAGAGGCGTGAAACAC
        AAAGGGTATAGGTTCTTCTTCAGATTCATATCACCAACAGTTTGAATATC
        CATTGCTTTCAGTTGAGCTTCGCATACACGACCAATTCCTCCAACCTAAA
        AAATTATCTAGGTAAAACTAGAAGGTTATGCTTTAATAGTCTCACCTTAC
        GAATCGGTAAATCCTTCAAAAACTCCATAATCGCGTTTTTATCATTTTCT
        .....

Output:

- FASTA file with predicted proteins
- Glimmer prediction file (optional)

        >CELF22B7  C.aenorhabditis elegans (Bristol N2) cosmid F22B7.
        orf00001    40137       52  +2     8.68
        orf00004      603       34  -1     2.91
        orf00006     1289     1095  -3     3.16
        orf00007     1555     1391  -2     2.33
        orf00008     1809     1576  -1     1.02
        orf00010     1953     2066  +3     3.09
        orf00011     2182     2304  +1     0.89
        orf00013     2390     2521  +2     0.60
        orf00018     2570     3073  +2     2.54
        orf00020     3196     3747  +1     2.91
        orf00022     3758     4000  +2     0.83
        orf00023     4399     4157  -2     1.31
        orf00025     4463     4759  +2     2.92
        orf00026     4878     5111  +3     0.78
        orf00027     5468     5166  -3     1.64
        orf00029     5590     5832  +1     0.29
        orf00032     6023     6226  +2     6.02
        orf00033     6217     6336  +1     3.09
        ........
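
The prediction file shown above can be read with a few lines of Python. This is a minimal sketch based only on the sample: it assumes five whitespace-separated columns (ORF ID, start, end, frame, score) after each `>` header line. The names `Prediction`, `parse_predictions` and `span_length` are hypothetical. The length calculation further assumes that a forward-strand ORF whose start is larger than its end (such as orf00001) wraps around the end of a circular sequence.

```python
from typing import Iterable, Iterator, NamedTuple


class Prediction(NamedTuple):
    seq_id: str
    orf_id: str
    start: int
    end: int
    frame: int    # sign gives the strand, magnitude the reading frame
    score: float


def parse_predictions(lines: Iterable[str]) -> Iterator[Prediction]:
    """Yield one Prediction per data row, remembering the current header."""
    seq_id = ""
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and the "....." truncation markers used in the samples.
        if not line or set(line) == {"."}:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            continue
        orf_id, start, end, frame, score = line.split()
        yield Prediction(seq_id, orf_id, int(start), int(end), int(frame), float(score))


def span_length(p: Prediction, seq_len: int) -> int:
    """Number of bases from start to end, inclusive.

    Assumption: the sequence is circular, so a negative distance wraps around.
    """
    distance = p.end - p.start if p.frame > 0 else p.start - p.end
    return distance % seq_len + 1


sample = """>CELF22B7  C.aenorhabditis elegans (Bristol N2) cosmid F22B7.
orf00001    40137       52  +2     8.68
orf00006     1289     1095  -3     3.16
"""
for p in parse_predictions(sample.splitlines()):
    print(p.orf_id, span_length(p, 40222))
# orf00001 138
# orf00006 195
```

The sequence length of 40222 is taken from the detailed report for the same sequence.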

- Glimmer detailed report (optional)

        >CELF22B7  C.aenorhabditis elegans (Bristol N2) cosmid F22B7.
        Sequence length = 40222

                   ----- Start -----           --- Length ----  ------------- Scores -------------
         ID  Frame   of Orf  of Gene     Stop   of Orf of Gene      Raw InFrm F1 F2 F3 R1 R2 R3 NC
        0001    +2    40137    40137       52      135     135     9.26    96  - 96  -  -  3  -  0
        0002    +1       58       64      180      120     114     5.01    69 69  -  - 30  -  -  0
                +3      300      309      422      120     111    -0.68    20  -  - 20 38  -  - 41
                +3      423      432      545      120     111     1.29    21  - 51 21 13  -  8  5
        0003    +2      401      416      595      192     177     2.51    93  - 93  -  5  -  -  1
        0004    -1      645      552       34      609     516     2.33    99  -  -  - 99  -  -  0
                +1      562      592      762      198     168    -2.54     1  1  -  -  -  -  - 98
                +1      763      772      915      150     141    -1.34     1  1  -  -  -  - 86 11
                +3      837      846     1007      168     159     1.35    28  - 50 28  -  - 17  3
        0005    -3     1073      977      654      417     321     0.52    84  -  -  -  -  - 84 15
        0006    -3     1373     1319     1095      276     222     3.80    99  -  -  -  -  - 99  0
        0007    -2     1585     1555     1391      192     162     2.70    98  -  -  -  - 98  -  1
        0008    -1     1812     1809     1576      234     231     1.26    94  -  -  - 94  -  -  5
        0009    +2     1721     1730     1945      222     213     0.68    80  - 80  -  -  -  - 19
        .....
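
In the detailed report, rows that carry an ID are the regions that were given a number, and rows without one were not. This relates to the in-frame score threshold described in the option text. The sketch below separates the two kinds of rows so that their in-frame scores can be compared. It is derived only from the sample rows above: it assumes 16 whitespace-separated columns for numbered rows and 15 for unnumbered rows, and the names `OrfScore` and `parse_detail_rows` are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class OrfScore(NamedTuple):
    gene_id: Optional[str]   # None when the row has no ID column
    frame: str
    gene_start: int
    stop: int
    raw_score: float
    in_frame: int


def parse_detail_rows(lines: Iterable[str]) -> Iterator[OrfScore]:
    """Yield the data rows of a detailed report, skipping headers and rulers."""
    for line in lines:
        fields = line.split()
        if len(fields) == 16 and fields[0].isdigit():
            gene_id: Optional[str] = fields[0]
            fields = fields[1:]
        elif len(fields) == 15 and fields[0][0] in "+-":
            gene_id = None
        else:
            continue  # header, separator, or truncation line
        # Remaining columns: frame, ORF start, gene start, stop,
        # ORF length, gene length, raw score, in-frame score, ...
        yield OrfScore(gene_id, fields[0], int(fields[2]), int(fields[3]),
                       float(fields[6]), int(fields[7]))


rows = [
    "0002    +1       58       64      180      120     114     5.01    69 69  -  - 30  -  -  0",
    "        +3      300      309      422      120     111    -0.68    20  -  - 20 38  -  - 41",
    "0003    +2      401      416      595      192     177     2.51    93  - 93  -  5  -  -  1",
]
parsed = list(parse_detail_rows(rows))
numbered = [r.in_frame for r in parsed if r.gene_id is not None]
unnumbered = [r.in_frame for r in parsed if r.gene_id is None]
print(min(numbered), max(unnumbered))  # 69 20
```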

References

A.L. Delcher, K.A. Bratke, E.C. Powers, and S.L. Salzberg. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics (Advance online version) (2007).