comparison test-data/meme_output_test2.txt @ 13:57e5d9382f36 draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/meme commit e2cf796f991cbe8c96e0cc5a0056b7255ac3ad6b
author iuc
date Thu, 17 May 2018 14:10:48 -0400
parents
children 3f0dd362b755
12:5585f04eb317 13:57e5d9382f36
1 ********************************************************************************
2 MEME - Motif discovery tool
3 ********************************************************************************
4 MEME version 4.12.0 (Release date: Tue Jun 27 16:22:50 2017 -0700)
5
6 For further information on how to interpret these results or to get
7 a copy of the MEME software please access http://meme-suite.org .
8
9 This file may be used as input to the MAST algorithm for searching
10 sequence databases for matches to groups of motifs. MAST is available
11 for interactive use and downloading at http://meme-suite.org .
12 ********************************************************************************
13
14
15 ********************************************************************************
16 REFERENCE
17 ********************************************************************************
18 If you use this program in your research, please cite:
19
20 Timothy L. Bailey and Charles Elkan,
21 "Fitting a mixture model by expectation maximization to discover
22 motifs in biopolymers", Proceedings of the Second International
23 Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
24 AAAI Press, Menlo Park, California, 1994.
25 ********************************************************************************
26
27
28 ********************************************************************************
29 TRAINING SET
30 ********************************************************************************
31 DATAFILE= Galaxy_FASTA_Input
32 ALPHABET= ACGT
33 Sequence name Weight Length Sequence name Weight Length
34 ------------- ------ ------ ------------- ------ ------
35 chr21_19617074_19617124_ 1.0000 50 chr21_26934381_26934431_ 1.0000 50
36 chr21_28217753_28217803_ 1.0000 50 chr21_31710037_31710087_ 1.0000 50
37 chr21_31744582_31744632_ 1.0000 50 chr21_31768316_31768366_ 1.0000 50
38 chr21_31914206_31914256_ 1.0000 50 chr21_31933633_31933683_ 1.0000 50
39 chr21_31962741_31962791_ 1.0000 50 chr21_31964683_31964733_ 1.0000 50
40 chr21_31973364_31973414_ 1.0000 50 chr21_31992870_31992920_ 1.0000 50
41 chr21_32185595_32185645_ 1.0000 50 chr21_32202076_32202126_ 1.0000 50
42 chr21_32253899_32253949_ 1.0000 50 chr21_32410820_32410870_ 1.0000 50
43 chr21_36411748_36411798_ 1.0000 50 chr21_37838750_37838800_ 1.0000 50
44 chr21_45705687_45705737_ 1.0000 50 chr21_45971413_45971463_ 1.0000 50
45 chr21_45978668_45978718_ 1.0000 50 chr21_45993530_45993580_ 1.0000 50
46 chr21_46020421_46020471_ 1.0000 50 chr21_46031920_46031970_ 1.0000 50
47 chr21_46046964_46047014_ 1.0000 50 chr21_46057197_46057247_ 1.0000 50
48 chr21_46086869_46086919_ 1.0000 50 chr21_46102103_46102153_ 1.0000 50
49 chr21_47517957_47518007_ 1.0000 50 chr21_47575506_47575556_ 1.0000 50
50 ********************************************************************************
51
52 ********************************************************************************
53 COMMAND LINE SUMMARY
54 ********************************************************************************
55 This information can also be useful in the event you wish to report a
56 problem with the MEME software.
57
58 command: meme meme_input_1.fasta -o meme_test2_out -nostatus -maxsize 1000000 -sf Galaxy_FASTA_Input -dna -mod zoops -nmotifs 1 -wnsites 0.8 -minw 8 -maxw 50 -wg 11 -ws 1 -maxiter 50 -distance 0.001 -prior dirichlet -b 0.01 -plib prior30.plib -spmap uni -spfuzz 0.5
59
60 model: mod= zoops nmotifs= 1 evt= inf
61 object function= E-value of product of p-values
62 width: minw= 8 maxw= 50
63 width: wg= 11 ws= 1 endgaps= yes
64 nsites: minsites= 2 maxsites= 30 wnsites= 0.8
65 theta: spmap= uni spfuzz= 0.5
66 global: substring= yes branching= no wbranch= no
67 em: prior= dirichlet b= 0.01 maxiter= 50
68 distance= 0.001
69 data: n= 1500 N= 30 shuffle= -1
70 strands: +
71 sample: seed= 0 ctfrac= -1 maxwords= -1
72 Dirichlet mixture priors file: prior30.plib
73 Letter frequencies in dataset:
74 A 0.294 C 0.231 G 0.257 T 0.217
75 Background letter frequencies (from dataset with add-one prior applied):
76 A 0.294 C 0.231 G 0.257 T 0.217
77 ********************************************************************************
78
79
80 ********************************************************************************
81 MOTIF GGSRTATAAAA MEME-1 width = 11 sites = 30 llr = 254 E-value = 5.1e-040
82 ********************************************************************************
83 --------------------------------------------------------------------------------
84 Motif GGSRTATAAAA MEME-1 Description
85 --------------------------------------------------------------------------------
86 Simplified A 3313:9:a798
87 pos.-specific C 1:3::1:::1:
88 probability G 6756::::::2
89 matrix T 1:11a1a:3::
90
91 bits 2.2 *
92 2.0 * *
93 1.8 * *
94 1.5 * ** *
95 Relative 1.3 * ** *
96 Entropy 1.1 ******
97 (12.2 bits) 0.9 * *******
98 0.7 * *******
99 0.4 ** ********
100 0.2 ***********
101 0.0 -----------
102
103 Multilevel GGGGTATAAAA
104 consensus AACA T
105 sequence
106
107 --------------------------------------------------------------------------------
108
109 --------------------------------------------------------------------------------
110 Motif GGSRTATAAAA MEME-1 sites sorted by position p-value
111 --------------------------------------------------------------------------------
112 Sequence name Start P-value Site
113 ------------- ----- --------- -----------
114 chr21_46046964_46047014_ 13 4.51e-07 AAGGCCAGGA GGGGTATAAAA GCCTGAGAGC
115 chr21_46031920_46031970_ 16 2.22e-06 ATACCCAGGG AGGGTATAAAA CCTCAGCAGC
116 chr21_32202076_32202126_ 14 2.74e-06 CCACCAGCTT GAGGTATAAAA AGCCCTGTAC
117 chr21_46057197_46057247_ 37 4.86e-06 ACAGGCCCTG GGCATATAAAA GCC
118 chr21_45993530_45993580_ 8 4.86e-06 CCAAGGA GGAGTATAAAA GCCCCACAAA
119 chr21_45971413_45971463_ 10 4.86e-06 CAGGCCCTG GGCATATAAAA GCCCCAGCAG
120 chr21_31964683_31964733_ 14 4.86e-06 GATTCACTGA GGCATATAAAA GGCCCTCTGC
121 chr21_47517957_47518007_ 33 6.48e-06 CCGGCGGGGC GGGGTATAAAG GGGGCGG
122 chr21_45978668_45978718_ 5 6.48e-06 CAGA GGGGTATAAAG GTTCCGACCA
123 chr21_32185595_32185645_ 19 6.48e-06 CACCAGAGCT GGGATATATAA AGAAGGTTCT
124 chr21_32410820_32410870_ 22 1.38e-05 AATCACTGAG GATGTATAAAA GTCCCAGGGA
125 chr21_31992870_31992920_ 17 1.38e-05 CACTATTGAA GATGTATAAAA TTTCATTTGC
126 chr21_19617074_19617124_ 40 1.41e-05 CCTCGGGACG TGGGTATATAA
127 chr21_31914206_31914256_ 16 1.61e-05 CCCACTACTT AGAGTATAAAA TCATTCTGAG
128 chr21_46020421_46020471_ 3 1.95e-05 GA GACATATAAAA GCCAACATCC
129 chr21_32253899_32253949_ 18 1.95e-05 CCCACCAGCA AGGATATATAA AAGCTCAGGA
130 chr21_45705687_45705737_ 38 2.16e-05 CGTGGTCGCG GGGGTATAACA GC
131 chr21_47575506_47575556_ 31 3.04e-05 GCTGCCGGTG AGCGTATAAAG GCCCTGGCG
132 chr21_31744582_31744632_ 13 3.04e-05 CAGGTCTAAG AGCATATATAA CTTGGAGTCC
133 chr21_31768316_31768366_ 1 3.67e-05 . AACGTATATAA ATGGTCCTGT
134 chr21_26934381_26934431_ 28 3.93e-05 AGTCACAAGT GAGTTATAAAA GGGTCGCACG
135 chr21_31933633_31933683_ 5 5.65e-05 TCAG AGTATATATAA ATGTTCCTGT
136 chr21_31710037_31710087_ 15 6.24e-05 CCCAGGTTTC TGAGTATATAA TCGCCGCACC
137 chr21_36411748_36411798_ 23 7.15e-05 AGTTTCAGTT GGCATCtaaaa attatataac
138 chr21_46102103_46102153_ 37 1.39e-04 TGCCTGGGTC CAGGTATAAAG GCT
139 chr21_46086869_46086919_ 38 1.39e-04 TGCCTGGGCC CAGGTATAAAG GC
140 chr21_37838750_37838800_ 3 4.81e-04 ga tggttttataa ggggcctcac
141 chr21_31962741_31962791_ 14 8.57e-04 TATAACTCAG GTTGGATAAAA TAATTTGTAC
142 chr21_31973364_31973414_ 8 1.47e-03 aaactta aaactctataa acttaaaact
143 chr21_28217753_28217803_ 27 2.64e-03 GGTGGGGGTG GGGGTTTCACT GGTCCACTAT
144 --------------------------------------------------------------------------------
145
146 --------------------------------------------------------------------------------
147 Motif GGSRTATAAAA MEME-1 block diagrams
148 --------------------------------------------------------------------------------
149 SEQUENCE NAME POSITION P-VALUE MOTIF DIAGRAM
150 ------------- ---------------- -------------
151 chr21_46046964_46047014_ 4.5e-07 12_[+1]_27
152 chr21_46031920_46031970_ 2.2e-06 15_[+1]_24
153 chr21_32202076_32202126_ 2.7e-06 13_[+1]_26
154 chr21_46057197_46057247_ 4.9e-06 36_[+1]_3
155 chr21_45993530_45993580_ 4.9e-06 7_[+1]_32
156 chr21_45971413_45971463_ 4.9e-06 9_[+1]_30
157 chr21_31964683_31964733_ 4.9e-06 13_[+1]_26
158 chr21_47517957_47518007_ 6.5e-06 32_[+1]_7
159 chr21_45978668_45978718_ 6.5e-06 4_[+1]_35
160 chr21_32185595_32185645_ 6.5e-06 18_[+1]_21
161 chr21_32410820_32410870_ 1.4e-05 21_[+1]_18
162 chr21_31992870_31992920_ 1.4e-05 16_[+1]_23
163 chr21_19617074_19617124_ 1.4e-05 39_[+1]
164 chr21_31914206_31914256_ 1.6e-05 15_[+1]_24
165 chr21_46020421_46020471_ 1.9e-05 2_[+1]_37
166 chr21_32253899_32253949_ 1.9e-05 17_[+1]_22
167 chr21_45705687_45705737_ 2.2e-05 37_[+1]_2
168 chr21_47575506_47575556_ 3e-05 30_[+1]_9
169 chr21_31744582_31744632_ 3e-05 12_[+1]_27
170 chr21_31768316_31768366_ 3.7e-05 [+1]_39
171 chr21_26934381_26934431_ 3.9e-05 27_[+1]_12
172 chr21_31933633_31933683_ 5.6e-05 4_[+1]_35
173 chr21_31710037_31710087_ 6.2e-05 14_[+1]_25
174 chr21_36411748_36411798_ 7.1e-05 22_[+1]_17
175 chr21_46102103_46102153_ 0.00014 36_[+1]_3
176 chr21_46086869_46086919_ 0.00014 37_[+1]_2
177 chr21_37838750_37838800_ 0.00048 2_[+1]_37
178 chr21_31962741_31962791_ 0.00086 13_[+1]_26
179 chr21_31973364_31973414_ 0.0015 7_[+1]_32
180 chr21_28217753_28217803_ 0.0026 26_[+1]_13
181 --------------------------------------------------------------------------------
182
183 --------------------------------------------------------------------------------
184 Motif GGSRTATAAAA MEME-1 in BLOCKS format
185 --------------------------------------------------------------------------------
186 BL MOTIF GGSRTATAAAA width=11 seqs=30
187 chr21_46046964_46047014_ ( 13) GGGGTATAAAA 1
188 chr21_46031920_46031970_ ( 16) AGGGTATAAAA 1
189 chr21_32202076_32202126_ ( 14) GAGGTATAAAA 1
190 chr21_46057197_46057247_ ( 37) GGCATATAAAA 1
191 chr21_45993530_45993580_ ( 8) GGAGTATAAAA 1
192 chr21_45971413_45971463_ ( 10) GGCATATAAAA 1
193 chr21_31964683_31964733_ ( 14) GGCATATAAAA 1
194 chr21_47517957_47518007_ ( 33) GGGGTATAAAG 1
195 chr21_45978668_45978718_ ( 5) GGGGTATAAAG 1
196 chr21_32185595_32185645_ ( 19) GGGATATATAA 1
197 chr21_32410820_32410870_ ( 22) GATGTATAAAA 1
198 chr21_31992870_31992920_ ( 17) GATGTATAAAA 1
199 chr21_19617074_19617124_ ( 40) TGGGTATATAA 1
200 chr21_31914206_31914256_ ( 16) AGAGTATAAAA 1
201 chr21_46020421_46020471_ ( 3) GACATATAAAA 1
202 chr21_32253899_32253949_ ( 18) AGGATATATAA 1
203 chr21_45705687_45705737_ ( 38) GGGGTATAACA 1
204 chr21_47575506_47575556_ ( 31) AGCGTATAAAG 1
205 chr21_31744582_31744632_ ( 13) AGCATATATAA 1
206 chr21_31768316_31768366_ ( 1) AACGTATATAA 1
207 chr21_26934381_26934431_ ( 28) GAGTTATAAAA 1
208 chr21_31933633_31933683_ ( 5) AGTATATATAA 1
209 chr21_31710037_31710087_ ( 15) TGAGTATATAA 1
210 chr21_36411748_36411798_ ( 23) GGCATCTAAAA 1
211 chr21_46102103_46102153_ ( 37) CAGGTATAAAG 1
212 chr21_46086869_46086919_ ( 38) CAGGTATAAAG 1
213 chr21_37838750_37838800_ ( 3) TGGTTTTATAA 1
214 chr21_31962741_31962791_ ( 14) GTTGGATAAAA 1
215 chr21_31973364_31973414_ ( 8) AAACTCTATAA 1
216 chr21_28217753_28217803_ ( 27) GGGGTTTCACT 1
217 //
218
219 --------------------------------------------------------------------------------
220
221 --------------------------------------------------------------------------------
222 Motif GGSRTATAAAA MEME-1 position-specific scoring matrix
223 --------------------------------------------------------------------------------
224 log-odds matrix: alength= 4 w= 11 n= 1200 bayes= 5.2854 E= 5.1e-040
225 -14 -179 114 -112
226 3 -1155 137 -270
227 -114 20 86 -71
228 3 -279 122 -170
229 -1155 -1155 -295 215
230 156 -179 -1155 -170
231 -1155 -1155 -1155 220
232 172 -279 -1155 -1155
233 125 -1155 -1155 46
234 167 -179 -1155 -1155
235 144 -1155 -63 -270
236 --------------------------------------------------------------------------------
237
238 --------------------------------------------------------------------------------
239 Motif GGSRTATAAAA MEME-1 position-specific probability matrix
240 --------------------------------------------------------------------------------
241 letter-probability matrix: alength= 4 w= 11 nsites= 30 E= 5.1e-040
242 0.266667 0.066667 0.566667 0.100000
243 0.300000 0.000000 0.666667 0.033333
244 0.133333 0.266667 0.466667 0.133333
245 0.300000 0.033333 0.600000 0.066667
246 0.000000 0.000000 0.033333 0.966667
247 0.866667 0.066667 0.000000 0.066667
248 0.000000 0.000000 0.000000 1.000000
249 0.966667 0.033333 0.000000 0.000000
250 0.700000 0.000000 0.000000 0.300000
251 0.933333 0.066667 0.000000 0.000000
252 0.800000 0.000000 0.166667 0.033333
253 --------------------------------------------------------------------------------
254
255 --------------------------------------------------------------------------------
256 Motif GGSRTATAAAA MEME-1 regular expression
257 --------------------------------------------------------------------------------
258 [GA][GA][GC][GA]TATA[AT]AA
259 --------------------------------------------------------------------------------
260
261
262
263
264 Time 0.38 secs.
265
266 ********************************************************************************
267
268
269 ********************************************************************************
270 SUMMARY OF MOTIFS
271 ********************************************************************************
272
273 --------------------------------------------------------------------------------
274 Combined block diagrams: non-overlapping sites with p-value < 0.0001
275 --------------------------------------------------------------------------------
276 SEQUENCE NAME COMBINED P-VALUE MOTIF DIAGRAM
277 ------------- ---------------- -------------
278 chr21_19617074_19617124_ 5.63e-04 39_[+1(1.41e-05)]
279 chr21_26934381_26934431_ 1.57e-03 27_[+1(3.93e-05)]_12
280 chr21_28217753_28217803_ 1.00e-01 50
281 chr21_31710037_31710087_ 2.49e-03 14_[+1(6.24e-05)]_25
282 chr21_31744582_31744632_ 1.22e-03 12_[+1(3.04e-05)]_27
283 chr21_31768316_31768366_ 1.47e-03 [+1(3.67e-05)]_39
284 chr21_31914206_31914256_ 6.45e-04 15_[+1(1.61e-05)]_24
285 chr21_31933633_31933683_ 2.26e-03 4_[+1(5.65e-05)]_35
286 chr21_31962741_31962791_ 3.37e-02 50
287 chr21_31964683_31964733_ 1.95e-04 13_[+1(4.86e-06)]_26
288 chr21_31973364_31973414_ 5.73e-02 50
289 chr21_31992870_31992920_ 5.52e-04 16_[+1(1.38e-05)]_23
290 chr21_32185595_32185645_ 2.59e-04 18_[+1(6.48e-06)]_21
291 chr21_32202076_32202126_ 1.10e-04 13_[+1(2.74e-06)]_26
292 chr21_32253899_32253949_ 7.78e-04 17_[+1(1.95e-05)]_22
293 chr21_32410820_32410870_ 5.52e-04 21_[+1(1.38e-05)]_18
294 chr21_36411748_36411798_ 2.85e-03 22_[+1(7.15e-05)]_17
295 chr21_37838750_37838800_ 1.90e-02 50
296 chr21_45705687_45705737_ 8.63e-04 37_[+1(2.16e-05)]_2
297 chr21_45971413_45971463_ 1.95e-04 9_[+1(4.86e-06)]_30
298 chr21_45978668_45978718_ 2.59e-04 4_[+1(6.48e-06)]_35
299 chr21_45993530_45993580_ 1.95e-04 7_[+1(4.86e-06)]_32
300 chr21_46020421_46020471_ 7.78e-04 2_[+1(1.95e-05)]_37
301 chr21_46031920_46031970_ 8.89e-05 15_[+1(2.22e-06)]_24
302 chr21_46046964_46047014_ 1.80e-05 12_[+1(4.51e-07)]_27
303 chr21_46057197_46057247_ 1.95e-04 36_[+1(4.86e-06)]_3
304 chr21_46086869_46086919_ 5.54e-03 50
305 chr21_46102103_46102153_ 5.54e-03 50
306 chr21_47517957_47518007_ 2.59e-04 32_[+1(6.48e-06)]_7
307 chr21_47575506_47575556_ 1.22e-03 30_[+1(3.04e-05)]_9
308 --------------------------------------------------------------------------------
309
310 ********************************************************************************
311
312
313 ********************************************************************************
314 Stopped because requested number of motifs (1) found.
315 ********************************************************************************
316
317 CPU: ThinkPad-T450s
318
319 ********************************************************************************