gbaserelativeentropy

 

Function

Calculates and graphs the sequence conservation using Kullback-Leibler divergence (relative entropy)

Description

This function calculates and graphs the sequence conservation in regions
around the start/stop codons using Kullback-Leibler divergence (relative
entropy). In realistic conditions, background nucleotide composition
(e.g. G+C content) varies among species. Kullback-Leibler divergence
measures each position against this background composition, so it
calculates the entropy with reduced background noise.
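
As an illustration of the idea, the following is a minimal sketch of the textbook per-position Kullback-Leibler divergence, D = sum over bases of p(b) * log2(p(b) / q(b)), where p is the base frequency observed at one position and q is the background frequency. It is not taken from the G-language source: the function name, the toy windows and the uniform background are assumptions made for this example, and the sign, logarithm base or normalization used by gbaserelativeentropy may differ.

```python
import math
from collections import Counter


def relative_entropy_per_column(windows, background):
    """Return one Kullback-Leibler divergence value (in bits) per column.

    windows    -- equal-length nucleotide strings aligned on the codon of interest
    background -- dict mapping each nucleotide to its background frequency
    """
    length = len(windows[0])
    scores = []
    for i in range(length):
        # Observed base counts at this position across all windows.
        counts = Counter(window[i] for window in windows)
        total = sum(counts.values())
        divergence = 0.0
        for base, n in counts.items():
            p = n / total
            # Bases that never occur contribute 0 and are simply absent here.
            divergence += p * math.log2(p / background[base])
        scores.append(divergence)
    return scores


# Toy data for illustration only (hypothetical, not real genomic windows).
windows = ["ATGA", "ATGC", "ATGG", "ATGT"]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
scores = relative_entropy_per_column(windows, background)
```

In this toy case the first three columns are fully conserved and score highest, while the last column matches the background exactly and scores zero.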

G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

WSDL(RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods is
provided at the Document Center:

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with gbaserelativeentropy

% gbaserelativeentropy refseqn:NC_000913
Calculates and graphs the sequence conservation using Kullback-Leibler
divergence (relative entropy)
Program compseq output file (optional) [nc_000913.gbaserelativeentropy]: 


Example 2

% gbaserelativeentropy refseqn:NC_000913 -plot -graph png
Calculates and graphs the sequence conservation using Kullback-Leibler
divergence (relative entropy)
Created gbaserelativeentropy.1.png


Command line arguments

Qualifier | Type | Description | Allowed values | Default
Standard (Mandatory) qualifiers
[-sequence] (Parameter 1) | seqall | Nucleotide sequence(s) filename and optional format, or reference (input USA) | Readable sequence(s) | Required
-graph | xygraph | Graph type | EMBOSS has a list of known devices, including ps, hpgl, hp7470, hp7580, meta, cps, x11, tek, tekt, none, data, xterm, png, gif, svg | EMBOSS_GRAPHICS value, or x11
-outfile | outfile | Program compseq output file (optional) | Output file | <*>.gbaserelativeentropy
Additional (Optional) qualifiers
(none)
Advanced (Unprompted) qualifiers
-position | selection | Either 'start' (around start codon) or 'end' (around stop codon) to create the PWM | Choose from selection list of values | start
-patlen | integer | Length of oligomer to count | Any integer value | 3
-upstream | integer | Length upstream of specified position to create PWM | Any integer value | 30
-downstream | integer | Length downstream of specified position to create PWM | Any integer value | 30
-[no]accid | boolean | Include to use sequence accession ID as query | Boolean value Yes/No | Yes
-plot | toggle | Include to plot result | Toggle value Yes/No | No
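
For scripted use, the qualifiers listed above can be assembled into an argument list before calling the program. The sketch below is only an illustration: the helper name build_command is hypothetical, it uses only qualifiers documented in the table, and it assumes that gbaserelativeentropy is installed and on the PATH if the command is actually run.

```python
import shlex


def build_command(usa, position="start", upstream=30, downstream=30,
                  outfile=None, plot=False, graph=None):
    """Build an argument list for gbaserelativeentropy (hypothetical helper).

    usa is the input sequence reference, e.g. "refseqn:NC_000913".
    """
    if position not in ("start", "end"):
        raise ValueError("position must be 'start' or 'end'")
    cmd = [
        "gbaserelativeentropy", usa,
        "-position", position,
        "-upstream", str(upstream),
        "-downstream", str(downstream),
    ]
    if outfile is not None:
        cmd += ["-outfile", outfile]
    if plot:
        cmd.append("-plot")
        if graph is not None:
            # Graph device name, e.g. "png" as in Example 2.
            cmd += ["-graph", graph]
    return cmd


cmd = build_command("refseqn:NC_000913", position="end", plot=True, graph="png")
print(shlex.join(cmd))
# To execute it (requires the program to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.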

Input file format

The database definitions for the following commands are available at
http://soap.g-language.org/kbws/embossrc

gbaserelativeentropy reads one or more nucleotide sequences.

Output file format

The output from gbaserelativeentropy is written to a plain text file or to the EMBOSS graphics device.

File: nc_000913.gbaserelativeentropy

Sequence: NC_000913
-30,-0.46682
-29,-0.46265
-28,-0.45732
-27,-0.45704
-26,-0.44692
-25,-0.44396
-24,-0.43528
-23,-0.43419
-22,-0.42518

[Part of this file has been deleted for brevity]

21,-0.40010
22,-0.41772
23,-0.42503
24,-0.39675
25,-0.43091
26,-0.43196
27,-0.40576
28,-0.43387
29,-0.41228
30,-0.38869
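
The text output can be loaded for further analysis with a few lines of code. The following is a minimal sketch that assumes the layout seen in the excerpt above: a "Sequence: <ID>" header followed by "position,value" lines. The function name parse_relative_entropy is hypothetical, and the handling of files with several sequences is an assumption inferred from that layout.

```python
def parse_relative_entropy(text):
    """Parse gbaserelativeentropy text output into {sequence_id: [(pos, value), ...]}."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Sequence:"):
            # Start a new record for each "Sequence: <ID>" header.
            current = line.split(":", 1)[1].strip()
            results[current] = []
        elif current is not None and "," in line:
            # Data lines are "position,value" relative to the start/stop codon.
            position, value = line.split(",", 1)
            results[current].append((int(position), float(value)))
    return results


# A few lines copied from the example output above.
sample = """Sequence: NC_000913
-30,-0.46682
-29,-0.46265
30,-0.38869
"""
parsed = parse_relative_entropy(sample)
```

Each record is kept as a list of (position, value) pairs so that the original order of the file is preserved.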

Data files

None.

Notes

None.

References

   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gbase_entropy Calculates and graphs the sequence conservation using Shannon uncertainty (entropy)
gbase_information_content Calculates and graphs the sequence conservation using information content

Author(s)

Hidetoshi Itaya (celery@g-language.org)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None.