view glimmerHMM/glimmerhmm_predict.xml @ 0:0a15677c6668 default tip

Uploaded
author bjoern-gruening
date Wed, 11 Jan 2012 09:58:35 -0500
parents
children
line wrap: on
line source

<tool id="glimmerhmm_predict" name="GlimmerHMM" version="0.1">
	<description>Predict ORFs in eukaryotic genomes</description>
	<command>glimmerhmm 
	    $input /home/galaxy/lib/glimmer_trainings/train_crypto 
	    -o $output 
	    -g 
	    $svm_splice_prediction 
	    $partial_gene
	    #if str($top_n_predictions) != "-1":
	        -n $top_n_predictions
        #end if
	    2> /dev/null</command>
	<inputs>
		<param name="input" type="data" format="fasta" label="Genome Sequence"/>
		<param name="partial_gene" type="boolean" label="Don't make partial gene predictions" truevalue="-f" falsevalue="" checked="false" />
		<param name="svm_splice_prediction" type="boolean" label="Don't use svm splice site predictions" truevalue="-v" falsevalue="" checked="false" />
		<param name="top_n_predictions" type="integer" label="top n best predictions, -1 means infinite" value="-1"/>
	</inputs>
	<outputs>
        <data format="tabular" name="output">
            <change_format>
                <when input="gff" value="-g" format="gff" />
            </change_format>
        </data>
    </outputs>
	<help>

**What it does**

GlimmerHMM is a new gene finder based on a Generalized Hidden Markov Model (GHMM).
Although the gene finder conforms to the overall mathematical framework of a GHMM,
additionally it incorporates splice site models adapted from the GeneSplicer program and a
decision tree adapted from GlimmerM. It also utilizes Interpolated Markov Models for the
coding and noncoding models . Currently, GlimmerHMM's GHMM structure includes introns of each phase,
intergenic regions, and four types of exons (initial, internal, final, and single).
A basic user manual can be consulted here.

-----	

**Example**

Suppose you have the following DNA formatted sequences::

    >SQ   Sequence 8667507 BP; 1203558 A; 3121252 C; 3129638 G; 1213059 T; 0 other;
    cccgcggagcgggtaccacatcgctgcgcgatgtgcgagcgaacacccgggctgcgcccg
    ggtgttgcgctcccgctccgcgggagcgctggcgggacgctgcgcgtcccgctcaccaag
    cccgcttcgcgggcttggtgacgctccgtccgctgcgcttccggagttgcggggcttcgc
    cccgctaaccctgggcctcgcttcgctccgccttgggcctgcggcgggtccgctgcgctc
    ccccgcctcaagggcccttccggctgcgcctccaggacccaaccgcttgcgcgggcctgg

Running this tool will produce this::

    ##gff-version 3
    ##sequence-region ConsensusfromCH236920mapping 1 4148552
    ConsensusfromCH236920mapping  GlimmerHMM  mRNA  1       122     .   +   .   ID=ConsensusfromCH236920mapping.path1.gene1;Name=ConsensusfromCH236920mapping.path1.gene1
    ConsensusfromCH236920mapping  GlimmerHMM  CDS   1       122     .   +   0   ID=ConsensusfromCH236920mapping.cds1.1;
    ConsensusfromCH236920mapping  GlimmerHMM  mRNA  14066   15205   .   -   .   ID=ConsensusfromCH236920mapping.path1.gene2;Name=ConsensusfromCH236920mapping.path1.gene2
    ConsensusfromCH236920mapping  GlimmerHMM  CDS   14066   15034   .   -   0   ID=ConsensusfromCH236920mapping.cds2.1;
    ConsensusfromCH236920mapping  GlimmerHMM  CDS   15137   15205   .   -   0   ID=ConsensusfromCH236920mapping.cds2.2;
    ConsensusfromCH236920mapping  GlimmerHMM  mRNA  19910   24210   .   -   .   ID=ConsensusfromCH236920mapping.path1.gene3;Name=ConsensusfromCH236920mapping.path1.gene3



	</help>
</tool>