comparison GEMBASSY-1.0.3/doc/text/gkmertable.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
-1:000000000000 0:8300eb051bea
1 gkmertable
2 Function
3
4 Create an image showing all k-mer abundance within a sequence
5
6 Description
7
8 gkmertable creates an image showing the abundance of all k-mers
9 (oligonucleotides of length k) in a given sequence. For example, for
10 tetramers (k=4), the resulting image is composed of 4^4 = 256 boxes, each
11 representing an oligomer. The oligomer name and abundance are written within
12 these boxes, and abundance is also visualized with the box color, from
13 white (none) to black (highly frequent).
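
The counting step described above can be sketched as follows. This is a minimal illustration, not the gkmertable implementation: the function names kmer_counts and gray_level are hypothetical, and the linear scaling from white to black is an assumption, since the text only states the two ends of the color range.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq; k-mers never seen get 0."""
    seq = seq.upper()
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4^k oligomers so that absent ones are reported as 0.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: observed.get(kmer, 0) for kmer in all_kmers}


def gray_level(count, max_count):
    """Map an abundance to a gray value: 255 is white (none), 0 is black.

    Linear scaling is an assumption made for this sketch.
    """
    if max_count == 0:
        return 255
    return round(255 * (1 - count / max_count))
```

For k=4 the returned dictionary has 256 entries, one per box in the image.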
14
15 This k-mer table is alternatively known as the FCGR (frequency matrices
16 extracted from Chaos Game Representation).
17 The position of each oligomer can be located recursively as follows:
18 For each letter in an oligomer, the current box is subdivided into four quadrants,
19 where A is upper left, T is lower right, G is upper right, and C is lower
20 left.
21
22 Therefore, oligomer ATGC is located as follows:
23 A = upper left quadrant
24 T = lower right within the above quadrant
25 G = upper right within the above quadrant
26 C = lower left within the above quadrant
27 More detailed documentation is available at
28 http://www.g-language.org/wiki/cgr
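
The recursive placement rule above can be sketched as a small function. It assumes the image is treated as a 2^k by 2^k grid with row 0 at the top and column 0 at the left. The name kmer_cell and the grid convention are illustrative choices, not part of gkmertable itself.

```python
# Quadrant offsets as (row, column), with row 0 at the top of the image:
# A = upper left, G = upper right, C = lower left, T = lower right.
QUADRANT = {"A": (0, 0), "G": (0, 1), "C": (1, 0), "T": (1, 1)}


def kmer_cell(kmer):
    """Return the (row, column) of a k-mer in a 2^k by 2^k grid."""
    row = col = 0
    for base in kmer.upper():
        d_row, d_col = QUADRANT[base]
        # Each letter halves the current box, so the index doubles
        # and the quadrant offset selects which half to descend into.
        row = 2 * row + d_row
        col = 2 * col + d_col
    return row, col
```

With this convention, kmer_cell("ATGC") gives row 5, column 6 of the 16 by 16 tetramer grid, which matches the four steps listed for ATGC.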
29
30 The G-language SOAP service is provided by the
31 Institute for Advanced Biosciences, Keio University.
32 The original web service is located at the following URL:
33
34 http://www.g-language.org/wiki/soap
35
36 The WSDL (RPC/Encoded) file is located at:
37
38 http://soap.g-language.org/g-language.wsdl
39
40 Documentation on G-language Genome Analysis Environment methods is
41 provided at the Document Center:
42
43 http://ws.g-language.org/gdoc/
44
45 Usage
46
47 Here is a sample session with gkmertable
48
49 % gkmertable refseqn:NC_000913
50 Create an image showing all k-mer abundance within a sequence
51 Created gkmertable.1.png
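
For scripted use, the session above can be driven from Python. The sketch below only assembles the argument list from the qualifiers documented under "Command line arguments" (-sequence, -k, -format, -goutfile, -auto) and their stated defaults. The helper names are hypothetical, and it assumes gkmertable is installed and on the PATH.

```python
import subprocess


def gkmertable_command(usa, k=6, fmt="png", goutfile="gkmertable"):
    """Build the gkmertable argument list for one input USA.

    Defaults mirror the documented ones: k=6, format png,
    output file name gkmertable.
    """
    return [
        "gkmertable",
        "-sequence", usa,
        "-k", str(k),
        "-format", fmt,
        "-goutfile", goutfile,
        "-auto",  # turn off interactive prompts
    ]


def run_gkmertable(usa, **options):
    """Run gkmertable; requires the program to be installed."""
    return subprocess.run(gkmertable_command(usa, **options), check=True)
```

For example, run_gkmertable("refseqn:NC_000913", k=4) would request the tetramer table for the sequence used in the sample session. Because the program is documented as always exiting with status 0, a script may prefer to confirm that the image file was created instead of relying on the exit status.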
52
53 Go to the input files for this example
54 Go to the output files for this example
55
56 Command line arguments
57
58 Standard (Mandatory) qualifiers:
59 [-sequence] seqall Nucleotide sequence(s) filename and optional
60 format, or reference (input USA)
61
62 Additional (Optional) qualifiers: (none)
63 Advanced (Unprompted) qualifiers:
64 -format string [png] Output file format. Dependent on
65 'convert' command (Any string)
66 -k integer [6] Length of oligomer (Any integer value)
67 -goutfile string [gkmertable] Output file for non interactive
68 displays (Any string)
69
70 Associated qualifiers:
71
72 "-sequence" associated qualifiers
73 -sbegin1 integer Start of each sequence to be used
74 -send1 integer End of each sequence to be used
75 -sreverse1 boolean Reverse (if DNA)
76 -sask1 boolean Ask for begin/end/reverse
77 -snucleotide1 boolean Sequence is nucleotide
78 -sprotein1 boolean Sequence is protein
79 -slower1 boolean Make lower case
80 -supper1 boolean Make upper case
81 -scircular1 boolean Sequence is circular
82 -sformat1 string Input sequence format
83 -iquery1 string Input query fields or ID list
84 -ioffset1 integer Input start position offset
85 -sdbname1 string Database name
86 -sid1 string Entryname
87 -ufo1 string UFO features
88 -fformat1 string Features format
89 -fopenfile1 string Features file name
90
91 General qualifiers:
92 -auto boolean Turn off prompts
93 -stdout boolean Write first file to standard output
94 -filter boolean Read first file from standard input, write
95 first file to standard output
96 -options boolean Prompt for standard and additional values
97 -debug boolean Write debug output to program.dbg
98 -verbose boolean Report some/full command line options
99 -help boolean Report command line options and exit. More
100 information on associated and general
101 qualifiers can be found with -help -verbose
102 -warning boolean Report warnings
103 -error boolean Report errors
104 -fatal boolean Report fatal errors
105 -die boolean Report dying program messages
106 -version boolean Report version number and exit
107
108 Input file format
109
110 The database definitions for the following commands are available at
111 http://soap.g-language.org/kbws/embossrc
112
113 gkmertable reads one or more nucleotide sequences.
114
115 Output file format
116
117 gkmertable writes its output to an image file (PNG by default).
118
119 Data files
120
121 None.
122
123 Notes
124
125 None.
126
127 References
128
129 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
130 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
131 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
132
133 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
134 large-scale analysis of high-throughput omics data, J. Pest Sci.,
135 31, 7.
136
137 Arakawa, K., Kido, N., Oshita, K., and Tomita, M. (2010) G-language Genome
138 Analysis Environment with REST and SOAP Web Service Interfaces,
139 Nucleic Acids Res., 38, W700-W705.
140
141 Warnings
142
143 None.
144
145 Diagnostic Error Messages
146
147 None.
148
149 Exit status
150
151 It always exits with a status of 0.
152
153 Known bugs
154
155 None.
156
157 See also
158
159 gnucleotideperiodicity Checks the periodicity of certain oligonucleotides
160 goligomercounter Counts the number of given oligomers in a sequence
161 goligomersearch Searches for oligomers in a given sequence
162 gsignature Calculates oligonucleotide usage (genomic signature)
163
164 Author(s)
165
166 Hidetoshi Itaya (celery@g-language.org)
167 Institute for Advanced Biosciences, Keio University
168 252-0882 Japan
169
170 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
171 Institute for Advanced Biosciences, Keio University
172 252-0882 Japan
173
174 History
175
176 2012 - Written by Hidetoshi Itaya
177 2013 - Fixed by Hidetoshi Itaya
178
179 Target users
180
181 This program is intended to be used by everyone and everything, from
182 naive users to embedded scripts.
183
184 Comments
185
186 None.
187