gkmertable

 

Function

Create an image showing all k-mer abundance within a sequence

Description

gkmertable creates an image showing the abundance of all k-mers
(oligonucleotides of length k) in a given sequence. For example, for
tetramers (k=4), resulting image is composed of 4^4 = 256 boxes, each
representing an oligomer. Oligomer name and abundance is written within
these boxes, and abundance is also visualized with the box color, from
white (none) to black (highly frequent).
This k-mer table is alternatively known as the FCGR (frequency matrices
extracted from Chaos Game Representation).
Position of the oligomers can be recursively located as follows:
For each letter in an oligomer, a box is subdivided into four quadrants,
where A is upper left, T is lower right, G is upper right, and C is lower
left.
Therefore, oligomer ATGC is in the
A = upper left quadrant
T = lower right within the above quadrant
G = upper right within the above quadrant
C = lower left within the above quadrant
More detailed documentation is available at
http://www.g-language.org/wiki/cgr

G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

WSDL(RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods are
provided at the Document Center

http://ws.g-language.org/gdoc/

Usage


Here is a sample session with gkmertable

% gkmertable refseqn:NC_000913
Create an image showing all k-mer abundance within a sequence
Created gkmertable.1.png

   Go to the input files for this example
   Go to the output files for this example



Command line arguments

Qualifier Type Description Allowed values Default
Standard (Mandatory) qualifiers
[-sequence]
(Parameter 1)
seqall Nucleotide sequence(s) filename and optional format, or reference (input USA) Readable sequence(s) Required
Additional (Optional) qualifiers
(none)
Advanced (Unprompted) qualifiers
-format string Output file format. Dependent on 'convert' command Any string png
-k integer Length of oligomer Any integer value 6
-goutfile string Output file for non interactive displays Any string gkmertable

Input file format

The database definitions for following commands are available at
http://soap.g-language.org/kbws/embossrc

gkmertable reads one or more nucleotide sequences.

Output file format

The output from gkmertable is to an image file.

Data files

None.

Notes

None.

References

   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Konayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gnucleotideperiodicity Checks the periodicity of certain oligonucleotides
goligomercounter Counts the number of given oligomers in a sequence
goligomersearch Searches oligomers in given sequence
gsignature Calculate oligonucleotide usage (genomic signature)

Author(s)

   Hidetoshi Itaya (celery@g-language.org)
   Institute for Advanced Biosciences, Keio University
   252-0882 Japan

  Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scrips.

Comments

None.