comparison glimmer_long_orfs.xml @ 0:a4136c1534be draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/glimmer commit 37388949e348d221170659bbee547bf4ac67ef1a
author bgruening
date Tue, 28 Nov 2017 10:11:42 -0500
parents
children 8a7c41685ebe
-1:000000000000 0:a4136c1534be
1 <tool id="glimmer_long_orfs" name="Glimmer long ORFs" version="@WRAPPER_VERSION@">
2 <description>identify long, non-overlapping ORFs</description>
3 <macros>
4 <import>macros.xml</import>
5 </macros>
6 <expand macro="requirements"/>
7 <command><![CDATA[
8 long-orfs
9 -n -t
10 $cutoff
11 '$inputfile'
12 '$output'
13 2>&1
14 ]]></command>
15 <inputs>
16 <param name="inputfile" type="data" format="fasta" label="Genome Sequence" help="Dataset missing? See TIP below"/>
17 <param name='cutoff' type='float' label='cutoff' value='1.5'/>
18 </inputs>
19 <outputs>
20 <data format="tabular" name="output" />
21 </outputs>
22 <tests>
23 <test>
24 <param name="inputfile" value='streptomyces_Tu6071_genomic.fasta'/>
25 <param name='cutoff' value='1.5'/>
26 <output name="output" file='longORFSTestOutput.dat'/>
27 </test>
28 </tests>
29 <help><![CDATA[
30
31 **What it does**
32
33 This program identifies long, non-overlapping open reading frames (ORFs) in a DNA sequence file.
34 These ORFs are very likely to contain genes, and can be used as a set of training sequences for build-icm.
35 More specifically, among all ORFs longer than a minimum length, those that do not overlap any others are output. The start codon used for
36 each ORF is the first possible one. By default, the program automatically determines the
37 minimum length that maximizes the number of ORFs that are output. With the -t option, the initial
38 set of candidate ORFs can also be filtered using entropy distance, which generally produces
39 a larger, more accurate training set, particularly for high-GC-content genomes.
40
41
42
43 -----
44
45 **Glimmer Overview**
46
47 ::
48
49 **************      **************      **************      **************
50 *            *      *            *      *            *      *            *
51 * long-orfs  * ===> *  Extract   * ===> * build-icm  * ===> *  glimmer3  *
52 *            *      *            *      *            *      *            *
53 **************      **************      **************      **************
54
55 -----
56
57 **Example**
58
59
60 * input::
61
62 - Genome Sequence
63
64 CELF22B7 C.aenorhabditis elegans (Bristol N2) cosmid F22B7
65 GATCCTTGTAGATTTTGAATTTGAAGTTTTTTCTCATTCCAAAACTCTGT
66 GATCTGAAATAAAATGTCTCAAAAAAATAGAAGAAAACATTGCTTTATAT
67 TTATCAGTTATGGTTTTCAAAATTTTCTGACATACCGTTTTGCTTCTTTT
68 TTTCTCATCTTCTTCAAATATCAATTGTGATAATCTGACTCCTAACAATC
69 GAATTTCTTTTCCTTTTTCTTTTTCCAACAACTCCAGTGAGAACTTTTGA
70 ATATCTTCAAGTGACTTCACCACATCAGAAGGTGTCAACGATCTTGTGAG
71 AACATCGAATGAAGATAATTTTAATTTTAGAGTTACAGTTTTTCCTCCGA
72 CAATTCCTGATTTACGAACATCTTCTTCAAGCATTCTACAGATTTCTTGA
73 TGCTCTTCTAGGAGGATGTTGAAATCCGAAGTTGGAGAAAAAGTTCTCTC
74 AACTGAAATGCTTTTTCTTCGTGGATCCGATTCAGATGGACGACCTGGCA
75 GTCCGAGAGCCGTTCGAAGGAAAGATTCTTGTGAGAGAGGCGTGAAACAC
76 AAAGGGTATAGGTTCTTCTTCAGATTCATATCACCAACAGTTTGAATATC
77 CATTGCTTTCAGTTGAGCTTCGCATACACGACCAATTCCTCCAACCTAAA
78 AAATTATCTAGGTAAAACTAGAAGGTTATGCTTTAATAGTCTCACCTTAC
79 GAATCGGTAAATCCTTCAAAAACTCCATAATCGCGTTTTTATCATTTTCT
80 .....
81
82 - Cutoff 1.5
83
84 * output::
85
86 Sequence file = /home/mohammed/galaxy-central/database/files/000/dataset_34.dat
87 Excluded regions file = none
88 Circular genome = true
89 Initial minimum gene length = 90 bp
90 Determine optimal min gene length to maximize number of genes
91 Maximum overlap bases = 30
92 Start codons = atg,gtg,ttg
93 Stop codons = taa,tag,tga
94 Sequence length = 40222
95 Final minimum gene length = 97
96
97 Putative Genes:
98 00001 40137 52 +2 0.892
99 00002 1319 1095 -3 0.654
100 00003 1555 1391 -2 0.793
101 00004 1953 2066 +3 1.078
102 00005 2045 2146 +2 0.919
103 00006 4463 4759 +2 0.985
104 00007 6785 6582 -3 1.033
105 00008 6862 7020 +1 0.915
106 00009 7300 7488 +1 0.900
107 00010 7463 7570 +2 0.912
108 00011 8399 8527 +2 1.044
109 00012 10652 10545 -3 0.895
110 00013 12170 12066 -3 1.108
111 00014 13891 13748 -2 0.998
112 00015 14157 14044 -1 1.026
113 00016 15285 15410 +3 0.928
114 00017 15829 15704 -2 0.949
115
116 ....
117 ]]></help>
118 <expand macro="citation" />
119 </tool>
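
The ORF lines shown in the help example above can be read back with a few lines of Python. This is a minimal sketch, not part of the wrapper: the `Orf` type, the `parse_orf_lines` function, and the column names (identifier, start, end, signed reading frame, score) are assumptions inferred from the example output. Because the wrapper calls `long-orfs` with `-n`, the sketch does not rely on the `Putative Genes:` heading and simply keeps every line with five fields that parse as numbers.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    """One ORF line; the field names are assumptions, not Glimmer terminology."""
    orf_id: str
    start: int
    end: int
    frame: int    # signed reading frame, e.g. 2 for "+2", -3 for "-3"
    score: float


def parse_orf_lines(text: str) -> List[Orf]:
    """Collect every line that looks like: id start end frame score."""
    orfs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # heading lines, blank lines, "...." markers
        try:
            orfs.append(Orf(fields[0], int(fields[1]), int(fields[2]),
                            int(fields[3]), float(fields[4])))
        except ValueError:
            continue  # five words, but not an ORF line (e.g. a header setting)
    return orfs


# Lines copied from the example output in the help text.
SAMPLE = """\
Maximum overlap bases = 30
Sequence length = 40222

Putative Genes:
00001 40137 52 +2 0.892
00002 1319 1095 -3 0.654
00004 1953 2066 +3 1.078
"""

orfs = parse_orf_lines(SAMPLE)
forward = [o for o in orfs if o.frame > 0]   # ORFs in a forward reading frame
for o in orfs:
    print(o.orf_id, o.start, o.end, o.frame, o.score)
```

Note that the first ORF in the example starts at 40137 and ends at 52 on a forward frame, which matches the `Circular genome = true` setting: an ORF may wrap around the end of the sequence, so any length calculation has to account for that case.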