comparison glimmer_long_orfs.xml @ 0:a4136c1534be draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/glimmer commit 37388949e348d221170659bbee547bf4ac67ef1a
author | bgruening |
---|---|
date | Tue, 28 Nov 2017 10:11:42 -0500 |
parents | |
children | 8a7c41685ebe |
-1:000000000000 | 0:a4136c1534be |
---|---|
1 <tool id="glimmer_long_orfs" name="Glimmer long ORFs" version="@WRAPPER_VERSION@"> | |
2 <description>identify long, non-overlapping ORFs</description> | |
3 <macros> | |
4 <import>macros.xml</import> | |
5 </macros> | |
6 <expand macro="requirements"/> | |
7 <command><![CDATA[ | |
8 long-orfs | |
9 -n -t | |
10 $cutoff | |
11 '$inputfile' | |
12 '$output' | |
13 2>&1 | |
14 ]]></command> | |
15 <inputs> | |
16 <param name="inputfile" type="data" format="fasta" label="Genome Sequence" help="Dataset missing? See TIP below"/> | |
17 <param name='cutoff' type='float' label='cutoff' value='1.5'/> | |
18 </inputs> | |
19 <outputs> | |
20 <data format="tabular" name="output" /> | |
21 </outputs> | |
22 <tests> | |
23 <test> | |
24 <param name="inputfile" value='streptomyces_Tu6071_genomic.fasta'/> | |
25 <param name='cutoff' value='1.5'/> | |
26 <output name="output" file='longORFSTestOutput.dat'/> | |
27 </test> | |
28 </tests> | |
29 <help><![CDATA[ | |
30 | |
31 **What it does** | |
32 | |
33 This program identifies long, non-overlapping open reading frames (orfs) in a DNA sequence file. | |
34 These orfs are very likely to contain genes, and can be used as a set of training sequences. | |
35 More specifically, among all orfs longer than a minimum length, those that do not overlap any others are output. The start codon used for | |
36 each orf is the first possible one. By default, the program automatically determines the minimum-length | |
37 value that maximizes the number of orfs that are output. With the -t option, the initial | |
38 set of candidate orfs can also be filtered using entropy distance, which generally produces | |
39 a larger, more accurate training set, particularly for high-GC-content genomes. | |
40 | |
41 | |
42 | |
43 ----- | |
44 | |
45 **Glimmer Overview** | |
46 | |
47 :: | |
48 | |
49 ************** ************** ************** ************** | |
50 * * * * * * * * | |
51 * long-orfs * ===> * Extract * ===> * build-icm * ===> * glimmer3 * | |
52 * * * * * * * * | |
53 ************** ************** ************** ************** | |
54 | |
55 ----- | |
56 | |
57 **Example** | |
58 | |
59 | |
60 * input:: | |
61 | |
62 -Genome Sequence | |
63 | |
64 CELF22B7 C.aenorhabditis elegans (Bristol N2) cosmid F22B7 | |
65 GATCCTTGTAGATTTTGAATTTGAAGTTTTTTCTCATTCCAAAACTCTGT | |
66 GATCTGAAATAAAATGTCTCAAAAAAATAGAAGAAAACATTGCTTTATAT | |
67 TTATCAGTTATGGTTTTCAAAATTTTCTGACATACCGTTTTGCTTCTTTT | |
68 TTTCTCATCTTCTTCAAATATCAATTGTGATAATCTGACTCCTAACAATC | |
69 GAATTTCTTTTCCTTTTTCTTTTTCCAACAACTCCAGTGAGAACTTTTGA | |
70 ATATCTTCAAGTGACTTCACCACATCAGAAGGTGTCAACGATCTTGTGAG | |
71 AACATCGAATGAAGATAATTTTAATTTTAGAGTTACAGTTTTTCCTCCGA | |
72 CAATTCCTGATTTACGAACATCTTCTTCAAGCATTCTACAGATTTCTTGA | |
73 TGCTCTTCTAGGAGGATGTTGAAATCCGAAGTTGGAGAAAAAGTTCTCTC | |
74 AACTGAAATGCTTTTTCTTCGTGGATCCGATTCAGATGGACGACCTGGCA | |
75 GTCCGAGAGCCGTTCGAAGGAAAGATTCTTGTGAGAGAGGCGTGAAACAC | |
76 AAAGGGTATAGGTTCTTCTTCAGATTCATATCACCAACAGTTTGAATATC | |
77 CATTGCTTTCAGTTGAGCTTCGCATACACGACCAATTCCTCCAACCTAAA | |
78 AAATTATCTAGGTAAAACTAGAAGGTTATGCTTTAATAGTCTCACCTTAC | |
79 GAATCGGTAAATCCTTCAAAAACTCCATAATCGCGTTTTTATCATTTTCT | |
80 ..... | |
81 | |
82 - Cutoff 1.5 | |
83 | |
84 * output:: | |
85 | |
86 Sequence file = /home/mohammed/galaxy-central/database/files/000/dataset_34.dat | |
87 Excluded regions file = none | |
88 Circular genome = true | |
89 Initial minimum gene length = 90 bp | |
90 Determine optimal min gene length to maximize number of genes | |
91 Maximum overlap bases = 30 | |
92 Start codons = atg,gtg,ttg | |
93 Stop codons = taa,tag,tga | |
94 Sequence length = 40222 | |
95 Final minimum gene length = 97 | |
96 | |
97 Putative Genes: | |
98 00001 40137 52 +2 0.892 | |
99 00002 1319 1095 -3 0.654 | |
100 00003 1555 1391 -2 0.793 | |
101 00004 1953 2066 +3 1.078 | |
102 00005 2045 2146 +2 0.919 | |
103 00006 4463 4759 +2 0.985 | |
104 00007 6785 6582 -3 1.033 | |
105 00008 6862 7020 +1 0.915 | |
106 00009 7300 7488 +1 0.900 | |
107 00010 7463 7570 +2 0.912 | |
108 00011 8399 8527 +2 1.044 | |
109 00012 10652 10545 -3 0.895 | |
110 00013 12170 12066 -3 1.108 | |
111 00014 13891 13748 -2 0.998 | |
112 00015 14157 14044 -1 1.026 | |
113 00016 15285 15410 +3 0.928 | |
114 00017 15829 15704 -2 0.949 | |
115 | |
116 .... | |
117 ]]></help> | |
118 <expand macro="citation" /> | |
119 </tool> |
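
The "Putative Genes" rows in the help example above can be read back with a few lines of Python. This is a minimal sketch, not part of the wrapper. It assumes each gene row, as the tool writes it (without the line numbers this comparison view adds), has five whitespace-separated columns: an ID, two coordinates, a signed reading frame, and a numeric score. The names `Orf`, `parse_putative_genes`, and `orf_span` are hypothetical, and the meaning of the last column is not stated in the help text, so it is simply stored as `score`.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    """One row of the 'Putative Genes' listing (column meanings are assumed)."""
    orf_id: str
    start: int
    end: int
    frame: int    # signed: positive = forward strand, negative = reverse
    score: float  # last column; its exact meaning is not given in the help text


def parse_putative_genes(text: str) -> List[Orf]:
    """Collect gene rows, skipping header lines such as 'Sequence length = ...'."""
    orfs = []
    for line in text.splitlines():
        fields = line.split()
        # Gene rows have exactly five columns and start with a numeric ID.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        orf_id, start, end, frame, score = fields
        orfs.append(Orf(orf_id, int(start), int(end), int(frame), float(score)))
    return orfs


def orf_span(orf: Orf, seq_len: int) -> int:
    """Number of bases covered by the row, allowing wraparound on a circular genome."""
    if orf.frame > 0:
        span = orf.end - orf.start + 1
    else:
        span = orf.start - orf.end + 1
    if span <= 0:
        # The row crosses the origin of a circular sequence.
        span += seq_len
    return span


example = """\
Sequence length = 40222
Putative Genes:
00001    40137       52  +2   0.892
00002     1319     1095  -3   0.654
"""

for orf in parse_putative_genes(example):
    print(orf.orf_id, orf_span(orf, 40222))
# prints:
# 00001 138
# 00002 225
```

The first row starts near the end of the 40222-base sequence and ends near its beginning, which is only possible because the run reports `Circular genome = true`; the wraparound branch in `orf_span` covers that case.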