view test-data/meme_output_test2.txt @ 13:57e5d9382f36 draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/meme commit e2cf796f991cbe8c96e0cc5a0056b7255ac3ad6b
author | iuc |
---|---|
date | Thu, 17 May 2018 14:10:48 -0400 |
parents | |
children | 3f0dd362b755 |
********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.12.0 (Release date: Tue Jun 27 16:22:50 2017 -0700)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme-suite.org .

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme-suite.org .
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= Galaxy_FASTA_Input
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
chr21_19617074_19617124_ 1.0000     50  chr21_26934381_26934431_ 1.0000     50
chr21_28217753_28217803_ 1.0000     50  chr21_31710037_31710087_ 1.0000     50
chr21_31744582_31744632_ 1.0000     50  chr21_31768316_31768366_ 1.0000     50
chr21_31914206_31914256_ 1.0000     50  chr21_31933633_31933683_ 1.0000     50
chr21_31962741_31962791_ 1.0000     50  chr21_31964683_31964733_ 1.0000     50
chr21_31973364_31973414_ 1.0000     50  chr21_31992870_31992920_ 1.0000     50
chr21_32185595_32185645_ 1.0000     50  chr21_32202076_32202126_ 1.0000     50
chr21_32253899_32253949_ 1.0000     50  chr21_32410820_32410870_ 1.0000     50
chr21_36411748_36411798_ 1.0000     50  chr21_37838750_37838800_ 1.0000     50
chr21_45705687_45705737_ 1.0000     50  chr21_45971413_45971463_ 1.0000     50
chr21_45978668_45978718_ 1.0000     50  chr21_45993530_45993580_ 1.0000     50
chr21_46020421_46020471_ 1.0000     50  chr21_46031920_46031970_ 1.0000     50
chr21_46046964_46047014_ 1.0000     50  chr21_46057197_46057247_ 1.0000     50
chr21_46086869_46086919_ 1.0000     50  chr21_46102103_46102153_ 1.0000     50
chr21_47517957_47518007_ 1.0000     50  chr21_47575506_47575556_ 1.0000     50
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme meme_input_1.fasta -o meme_test2_out -nostatus -maxsize 1000000 -sf Galaxy_FASTA_Input -dna -mod zoops -nmotifs 1 -wnsites 0.8 -minw 8 -maxw 50 -wg 11 -ws 1 -maxiter 50 -distance 0.001 -prior dirichlet -b 0.01 -plib prior30.plib -spmap uni -spfuzz 0.5

model:  mod=         zoops    nmotifs=         1    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       30    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    0.001
data:   n=            1500    N=              30    shuffle=        -1
strands: +
sample: seed=            0    ctfrac=         -1    maxwords=       -1
Dirichlet mixture priors file: prior30.plib
Letter frequencies in dataset:
A 0.294 C 0.231 G 0.257 T 0.217
Background letter frequencies (from dataset with add-one prior applied):
A 0.294 C 0.231 G 0.257 T 0.217
********************************************************************************

********************************************************************************
MOTIF GGSRTATAAAA MEME-1 width = 11 sites = 30 llr = 254 E-value = 5.1e-040
********************************************************************************
--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  3313:9:a798
pos.-specific     C  1:3::1:::1:
probability       G  6756::::::2
matrix            T  1:11a1a:3::

         bits    2.2       *
                 2.0     * *
                 1.8     * *
                 1.5     * ** *
Relative         1.3     * ** *
Entropy          1.1     ******
(12.2 bits)      0.9  *  *******
                 0.7  *  *******
                 0.4 ** ********
                 0.2 ***********
                 0.0 -----------

Multilevel           GGGGTATAAAA
consensus            AACA    T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Start P-value             Site
-------------            ----- ---------           -----------
chr21_46046964_46047014_    13 4.51e-07 AAGGCCAGGA GGGGTATAAAA GCCTGAGAGC
chr21_46031920_46031970_    16 2.22e-06 ATACCCAGGG AGGGTATAAAA CCTCAGCAGC
chr21_32202076_32202126_    14 2.74e-06 CCACCAGCTT GAGGTATAAAA AGCCCTGTAC
chr21_46057197_46057247_    37 4.86e-06 ACAGGCCCTG GGCATATAAAA GCC
chr21_45993530_45993580_     8 4.86e-06    CCAAGGA GGAGTATAAAA GCCCCACAAA
chr21_45971413_45971463_    10 4.86e-06  CAGGCCCTG GGCATATAAAA GCCCCAGCAG
chr21_31964683_31964733_    14 4.86e-06 GATTCACTGA GGCATATAAAA GGCCCTCTGC
chr21_47517957_47518007_    33 6.48e-06 CCGGCGGGGC GGGGTATAAAG GGGGCGG
chr21_45978668_45978718_     5 6.48e-06       CAGA GGGGTATAAAG GTTCCGACCA
chr21_32185595_32185645_    19 6.48e-06 CACCAGAGCT GGGATATATAA AGAAGGTTCT
chr21_32410820_32410870_    22 1.38e-05 AATCACTGAG GATGTATAAAA GTCCCAGGGA
chr21_31992870_31992920_    17 1.38e-05 CACTATTGAA GATGTATAAAA TTTCATTTGC
chr21_19617074_19617124_    40 1.41e-05 CCTCGGGACG TGGGTATATAA
chr21_31914206_31914256_    16 1.61e-05 CCCACTACTT AGAGTATAAAA TCATTCTGAG
chr21_46020421_46020471_     3 1.95e-05         GA GACATATAAAA GCCAACATCC
chr21_32253899_32253949_    18 1.95e-05 CCCACCAGCA AGGATATATAA AAGCTCAGGA
chr21_45705687_45705737_    38 2.16e-05 CGTGGTCGCG GGGGTATAACA GC
chr21_47575506_47575556_    31 3.04e-05 GCTGCCGGTG AGCGTATAAAG GCCCTGGCG
chr21_31744582_31744632_    13 3.04e-05 CAGGTCTAAG AGCATATATAA CTTGGAGTCC
chr21_31768316_31768366_     1 3.67e-05          . AACGTATATAA ATGGTCCTGT
chr21_26934381_26934431_    28 3.93e-05 AGTCACAAGT GAGTTATAAAA GGGTCGCACG
chr21_31933633_31933683_     5 5.65e-05       TCAG AGTATATATAA ATGTTCCTGT
chr21_31710037_31710087_    15 6.24e-05 CCCAGGTTTC TGAGTATATAA TCGCCGCACC
chr21_36411748_36411798_    23 7.15e-05 AGTTTCAGTT GGCATCtaaaa attatataac
chr21_46102103_46102153_    37 1.39e-04 TGCCTGGGTC CAGGTATAAAG GCT
chr21_46086869_46086919_    38 1.39e-04 TGCCTGGGCC CAGGTATAAAG GC
chr21_37838750_37838800_     3 4.81e-04         ga tggttttataa ggggcctcac
chr21_31962741_31962791_    14 8.57e-04 TATAACTCAG GTTGGATAAAA TAATTTGTAC
chr21_31973364_31973414_     8 1.47e-03    aaactta aaactctataa acttaaaact
chr21_28217753_28217803_    27 2.64e-03 GGTGGGGGTG GGGGTTTCACT GGTCCACTAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr21_46046964_46047014_          4.5e-07  12_[+1]_27
chr21_46031920_46031970_          2.2e-06  15_[+1]_24
chr21_32202076_32202126_          2.7e-06  13_[+1]_26
chr21_46057197_46057247_          4.9e-06  36_[+1]_3
chr21_45993530_45993580_          4.9e-06  7_[+1]_32
chr21_45971413_45971463_          4.9e-06  9_[+1]_30
chr21_31964683_31964733_          4.9e-06  13_[+1]_26
chr21_47517957_47518007_          6.5e-06  32_[+1]_7
chr21_45978668_45978718_          6.5e-06  4_[+1]_35
chr21_32185595_32185645_          6.5e-06  18_[+1]_21
chr21_32410820_32410870_          1.4e-05  21_[+1]_18
chr21_31992870_31992920_          1.4e-05  16_[+1]_23
chr21_19617074_19617124_          1.4e-05  39_[+1]
chr21_31914206_31914256_          1.6e-05  15_[+1]_24
chr21_46020421_46020471_          1.9e-05  2_[+1]_37
chr21_32253899_32253949_          1.9e-05  17_[+1]_22
chr21_45705687_45705737_          2.2e-05  37_[+1]_2
chr21_47575506_47575556_            3e-05  30_[+1]_9
chr21_31744582_31744632_            3e-05  12_[+1]_27
chr21_31768316_31768366_          3.7e-05  [+1]_39
chr21_26934381_26934431_          3.9e-05  27_[+1]_12
chr21_31933633_31933683_          5.6e-05  4_[+1]_35
chr21_31710037_31710087_          6.2e-05  14_[+1]_25
chr21_36411748_36411798_          7.1e-05  22_[+1]_17
chr21_46102103_46102153_          0.00014  36_[+1]_3
chr21_46086869_46086919_          0.00014  37_[+1]_2
chr21_37838750_37838800_          0.00048  2_[+1]_37
chr21_31962741_31962791_          0.00086  13_[+1]_26
chr21_31973364_31973414_           0.0015  7_[+1]_32
chr21_28217753_28217803_           0.0026  26_[+1]_13
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GGSRTATAAAA width=11 seqs=30
chr21_46046964_46047014_ (   13) GGGGTATAAAA  1
chr21_46031920_46031970_ (   16) AGGGTATAAAA  1
chr21_32202076_32202126_ (   14) GAGGTATAAAA  1
chr21_46057197_46057247_ (   37) GGCATATAAAA  1
chr21_45993530_45993580_ (    8) GGAGTATAAAA  1
chr21_45971413_45971463_ (   10) GGCATATAAAA  1
chr21_31964683_31964733_ (   14) GGCATATAAAA  1
chr21_47517957_47518007_ (   33) GGGGTATAAAG  1
chr21_45978668_45978718_ (    5) GGGGTATAAAG  1
chr21_32185595_32185645_ (   19) GGGATATATAA  1
chr21_32410820_32410870_ (   22) GATGTATAAAA  1
chr21_31992870_31992920_ (   17) GATGTATAAAA  1
chr21_19617074_19617124_ (   40) TGGGTATATAA  1
chr21_31914206_31914256_ (   16) AGAGTATAAAA  1
chr21_46020421_46020471_ (    3) GACATATAAAA  1
chr21_32253899_32253949_ (   18) AGGATATATAA  1
chr21_45705687_45705737_ (   38) GGGGTATAACA  1
chr21_47575506_47575556_ (   31) AGCGTATAAAG  1
chr21_31744582_31744632_ (   13) AGCATATATAA  1
chr21_31768316_31768366_ (    1) AACGTATATAA  1
chr21_26934381_26934431_ (   28) GAGTTATAAAA  1
chr21_31933633_31933683_ (    5) AGTATATATAA  1
chr21_31710037_31710087_ (   15) TGAGTATATAA  1
chr21_36411748_36411798_ (   23) GGCATCTAAAA  1
chr21_46102103_46102153_ (   37) CAGGTATAAAG  1
chr21_46086869_46086919_ (   38) CAGGTATAAAG  1
chr21_37838750_37838800_ (    3) TGGTTTTATAA  1
chr21_31962741_31962791_ (   14) GTTGGATAAAA  1
chr21_31973364_31973414_ (    8) AAACTCTATAA  1
chr21_28217753_28217803_ (   27) GGGGTTTCACT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 1200 bayes= 5.2854 E= 5.1e-040
   -14   -179    114   -112
     3  -1155    137   -270
  -114     20     86    -71
     3   -279    122   -170
 -1155  -1155   -295    215
   156   -179  -1155   -170
 -1155  -1155  -1155    220
   172   -279  -1155  -1155
   125  -1155  -1155     46
   167   -179  -1155  -1155
   144  -1155    -63   -270
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 30 E= 5.1e-040
 0.266667  0.066667  0.566667  0.100000
 0.300000  0.000000  0.666667  0.033333
 0.133333  0.266667  0.466667  0.133333
 0.300000  0.033333  0.600000  0.066667
 0.000000  0.000000  0.033333  0.966667
 0.866667  0.066667  0.000000  0.066667
 0.000000  0.000000  0.000000  1.000000
 0.966667  0.033333  0.000000  0.000000
 0.700000  0.000000  0.000000  0.300000
 0.933333  0.066667  0.000000  0.000000
 0.800000  0.000000  0.166667  0.033333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif GGSRTATAAAA MEME-1 regular expression
--------------------------------------------------------------------------------
[GA][GA][GC][GA]TATA[AT]AA
--------------------------------------------------------------------------------

Time 0.38 secs.
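The position-specific scoring matrix and the position-specific probability matrix above look related by a simple transformation: each non-zero probability p, divided by the background frequency b of its letter and passed through 100 * log2(p / b), lands within about one unit of the corresponding log-odds entry. The sketch below checks that observation against the numbers in this file. It is not MEME's own code; the function names are hypothetical, the printed background frequencies are rounded to three decimals (so small deviations are expected), and entries with p = 0 are skipped because the -1155 floor presumably comes from a pseudocount that is not shown here.

```python
import math

# Background frequencies (A, C, G, T) as printed in the command line summary.
BACKGROUND = [0.294, 0.231, 0.257, 0.217]

# Letter-probability matrix for motif GGSRTATAAAA, copied from this file.
PROBS = [
    [0.266667, 0.066667, 0.566667, 0.100000],
    [0.300000, 0.000000, 0.666667, 0.033333],
    [0.133333, 0.266667, 0.466667, 0.133333],
    [0.300000, 0.033333, 0.600000, 0.066667],
    [0.000000, 0.000000, 0.033333, 0.966667],
    [0.866667, 0.066667, 0.000000, 0.066667],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.966667, 0.033333, 0.000000, 0.000000],
    [0.700000, 0.000000, 0.000000, 0.300000],
    [0.933333, 0.066667, 0.000000, 0.000000],
    [0.800000, 0.000000, 0.166667, 0.033333],
]

# Log-odds matrix for the same motif, copied from this file.
LOG_ODDS = [
    [-14, -179, 114, -112],
    [3, -1155, 137, -270],
    [-114, 20, 86, -71],
    [3, -279, 122, -170],
    [-1155, -1155, -295, 215],
    [156, -179, -1155, -170],
    [-1155, -1155, -1155, 220],
    [172, -279, -1155, -1155],
    [125, -1155, -1155, 46],
    [167, -179, -1155, -1155],
    [144, -1155, -63, -270],
]


def scaled_log_odds(p, b):
    """Return 100 * log2(p / b), rounded to the nearest integer."""
    return round(100 * math.log2(p / b))


def max_deviation(probs, log_odds, background):
    """Largest absolute gap between recomputed and printed scores (p > 0 only)."""
    worst = 0
    for prob_row, score_row in zip(probs, log_odds):
        for p, score, b in zip(prob_row, score_row, background):
            if p == 0.0:
                continue  # zero-probability cells use a floor value; skipped here
            worst = max(worst, abs(scaled_log_odds(p, b) - score))
    return worst


if __name__ == "__main__":
    print("largest deviation:", max_deviation(PROBS, LOG_ODDS, BACKGROUND))
```

For example, position 7 is always T, and `scaled_log_odds(1.0, 0.217)` gives the 220 printed in the seventh row.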
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************
--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
chr21_19617074_19617124_         5.63e-04  39_[+1(1.41e-05)]
chr21_26934381_26934431_         1.57e-03  27_[+1(3.93e-05)]_12
chr21_28217753_28217803_         1.00e-01  50
chr21_31710037_31710087_         2.49e-03  14_[+1(6.24e-05)]_25
chr21_31744582_31744632_         1.22e-03  12_[+1(3.04e-05)]_27
chr21_31768316_31768366_         1.47e-03  [+1(3.67e-05)]_39
chr21_31914206_31914256_         6.45e-04  15_[+1(1.61e-05)]_24
chr21_31933633_31933683_         2.26e-03  4_[+1(5.65e-05)]_35
chr21_31962741_31962791_         3.37e-02  50
chr21_31964683_31964733_         1.95e-04  13_[+1(4.86e-06)]_26
chr21_31973364_31973414_         5.73e-02  50
chr21_31992870_31992920_         5.52e-04  16_[+1(1.38e-05)]_23
chr21_32185595_32185645_         2.59e-04  18_[+1(6.48e-06)]_21
chr21_32202076_32202126_         1.10e-04  13_[+1(2.74e-06)]_26
chr21_32253899_32253949_         7.78e-04  17_[+1(1.95e-05)]_22
chr21_32410820_32410870_         5.52e-04  21_[+1(1.38e-05)]_18
chr21_36411748_36411798_         2.85e-03  22_[+1(7.15e-05)]_17
chr21_37838750_37838800_         1.90e-02  50
chr21_45705687_45705737_         8.63e-04  37_[+1(2.16e-05)]_2
chr21_45971413_45971463_         1.95e-04  9_[+1(4.86e-06)]_30
chr21_45978668_45978718_         2.59e-04  4_[+1(6.48e-06)]_35
chr21_45993530_45993580_         1.95e-04  7_[+1(4.86e-06)]_32
chr21_46020421_46020471_         7.78e-04  2_[+1(1.95e-05)]_37
chr21_46031920_46031970_         8.89e-05  15_[+1(2.22e-06)]_24
chr21_46046964_46047014_         1.80e-05  12_[+1(4.51e-07)]_27
chr21_46057197_46057247_         1.95e-04  36_[+1(4.86e-06)]_3
chr21_46086869_46086919_         5.54e-03  50
chr21_46102103_46102153_         5.54e-03  50
chr21_47517957_47518007_         2.59e-04  32_[+1(6.48e-06)]_7
chr21_47575506_47575556_         1.22e-03  30_[+1(3.04e-05)]_9
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because requested number of motifs (1) found.
********************************************************************************

CPU: ThinkPad-T450s

********************************************************************************
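The motif diagrams in the summary encode each sequence as a run of gaps and motif sites. `12_[+1(4.51e-07)]_27` reads as 12 unmatched positions, one site of motif 1 on the + strand with that position p-value, then 27 unmatched positions; a bare `50` means no site passed the p-value threshold. The sketch below parses such strings and adds the pieces back up, assuming the motif width of 11 reported for this motif. The function names and the regular expression are illustrative assumptions, not part of MEME.

```python
import re

MOTIF_WIDTH = 11  # width reported for motif GGSRTATAAAA in this file

# One site token, e.g. "[+1(4.51e-07)]" or, without a p-value, "[+1]".
SITE = re.compile(r"\[([+-])(\d+)(?:\(([0-9.eE+-]+)\))?\]")


def parse_diagram(diagram):
    """Split a motif diagram into ("gap", n) and ("site", strand, motif, p) parts."""
    parts = []
    for token in diagram.split("_"):
        match = SITE.fullmatch(token)
        if match:
            strand, motif, pvalue = match.groups()
            parts.append(
                ("site", strand, int(motif), float(pvalue) if pvalue else None)
            )
        else:
            parts.append(("gap", int(token)))
    return parts


def diagram_length(diagram, width=MOTIF_WIDTH):
    """Total sequence length implied by a diagram, given a single motif width."""
    total = 0
    for part in parse_diagram(diagram):
        total += width if part[0] == "site" else part[1]
    return total


if __name__ == "__main__":
    for text in ("12_[+1(4.51e-07)]_27", "39_[+1(1.41e-05)]", "[+1(3.67e-05)]_39", "50"):
        print(text, parse_diagram(text), diagram_length(text))
```

Every diagram in the table should add up to the 50-residue sequence length listed in the training set, which makes this a quick consistency check when comparing test output files.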