gdinuc

 

Function

Calculates dinucleotide usage

Description

gdinuc calculates dinucleotide usage indices for protein-coding sequences
(excluding start and stop codons). Dinucleotide usage is computed as the
ratio of observed (O) to expected (E) dinucleotide frequencies within the
given sequence. Dinucleotide usage patterns are known to be consistent
within a genome (genomic signatures) and tend to show characteristic
periodicities.
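
The O/E ratio described above can be illustrated with a short Python sketch. This is not the G-language implementation: the function name dinucleotide_oe is hypothetical, and the expected frequency is assumed here to be the product of the two mononucleotide frequencies, which this document does not state explicitly.

```python
from collections import Counter
from itertools import product


def dinucleotide_oe(seq):
    """Return the observed/expected ratio for each of the 16 dinucleotides."""
    seq = seq.lower()
    # Observed: counts of overlapping pairs made only of a, c, g and t.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pair_counts = Counter(p for p in pairs if set(p) <= set("acgt"))
    base_counts = Counter(b for b in seq if b in "acgt")
    n_pairs = sum(pair_counts.values())
    n_bases = sum(base_counts.values())
    if n_pairs == 0:
        raise ValueError("sequence contains no usable dinucleotides")

    ratios = {}
    for x, y in product("acgt", repeat=2):
        observed = pair_counts[x + y] / n_pairs
        # Assumption: expected frequency = f(x) * f(y).
        expected = (base_counts[x] / n_bases) * (base_counts[y] / n_bases)
        # A base that never occurs gives an undefined ratio.
        ratios[x + y] = observed / expected if expected else float("nan")
    return ratios


ratios = dinucleotide_oe("acac")
print(round(ratios["ac"], 3))
```

A ratio above 1 means the dinucleotide occurs more often than the base composition alone would predict; a ratio below 1 means it is under-represented.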

The G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

The WSDL (RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods is
provided at the Document Center:

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with gdinuc:

% gdinuc refseqn:NC_000913
Calculates dinucleotide usage
Program compseq output file [nc_000913.gdinuc]: 
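
The same run can also be scripted by passing both parameters explicitly, so that the output file prompt does not appear. The following Python sketch only assembles the argument list from the qualifiers documented under "Command line arguments"; the helper names gdinuc_command and run_gdinuc are hypothetical, and actually running the command assumes gdinuc is installed and on the PATH.

```python
import subprocess


def gdinuc_command(sequence, outfile, position="all", translate=False, accid=True):
    """Build an argument list for gdinuc from the documented qualifiers."""
    if position not in {"all", "12", "23", "31"}:
        raise ValueError(f"unsupported -position value: {position}")
    cmd = ["gdinuc", "-sequence", sequence, "-outfile", outfile,
           "-position", position]
    if translate:
        cmd.append("-translate")
    if not accid:
        # Boolean qualifiers are negated with the "no" prefix.
        cmd.append("-noaccid")
    return cmd


def run_gdinuc(sequence, outfile, **options):
    """Run gdinuc; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(gdinuc_command(sequence, outfile, **options), check=True)


print(" ".join(gdinuc_command("refseqn:NC_000913", "nc_000913.gdinuc")))
```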


Command line arguments

Standard (Mandatory) qualifiers

  [-sequence] (Parameter 1)
    Type:            seqall
    Description:     Nucleotide sequence(s) filename and optional format,
                     or reference (input USA)
    Allowed values:  Readable sequence(s)
    Default:         Required

  [-outfile] (Parameter 2)
    Type:            outfile
    Description:     Program compseq output file
    Allowed values:  Output file
    Default:         <*>.gdinuc

Additional (Optional) qualifiers

  (none)

Advanced (Unprompted) qualifiers

  -translate
    Type:            boolean
    Description:     Include when translates using standard codon table
    Allowed values:  Boolean value Yes/No
    Default:         No

  -position
    Type:            list
    Description:     Codon position or reading frame
    Allowed values:  all (Assess all codon positions)
                     12 (Assess the reading frame 1-2)
                     23 (Assess the reading frame 2-3)
                     31 (Assess the reading frame 3-1)
    Default:         all

  -delkey
    Type:            string
    Description:     Regular expression to delete key (i.e. amino acids
                     and nucleotides)
    Allowed values:  Any string
    Default:         [^ACDEFGHIKLMNPQRSTVWYacgtU]

  -[no]accid
    Type:            boolean
    Description:     Include to use sequence accession ID as query
    Allowed values:  Boolean value Yes/No
    Default:         Yes
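
One plausible reading of the -position values is sketched below in Python. It is an interpretation rather than the program's confirmed behaviour: "12" and "23" are taken to mean the pairs formed by codon positions 1-2 and 2-3, and "31" the pair formed by position 3 of one codon and position 1 of the next. The function name dinucleotides_at is hypothetical, and the input is assumed to be an in-frame coding sequence with start and stop codons already removed.

```python
def dinucleotides_at(cds, position="all"):
    """List the dinucleotides of an in-frame coding sequence for one -position value."""
    cds = cds.lower()
    if position == "all":
        # Every overlapping pair, regardless of codon position.
        return [cds[i:i + 2] for i in range(len(cds) - 1)]
    # 0-based offset, within the codon, of the first base of each pair.
    offsets = {"12": 0, "23": 1, "31": 2}
    if position not in offsets:
        raise ValueError(f"unsupported position: {position}")
    # Step one codon at a time; "31" pairs span a codon boundary.
    return [cds[i:i + 2] for i in range(offsets[position], len(cds) - 1, 3)]


for value in ("all", "12", "23", "31"):
    print(value, dinucleotides_at("atggcc", value))
```

Under this reading, the codons atg and gcc contribute "at" and "gc" for 12, "tg" and "cc" for 23, and the single boundary pair "gg" for 31.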

Input file format

The database definitions for the following commands are available at
http://soap.g-language.org/kbws/embossrc

gdinuc reads one or more nucleotide sequences.

Output file format

The output from gdinuc is to a plain text file.

File: nc_000913.gdinuc

Sequence: NC_000913

keys,aa,ac,ag,at,ca,cc,cg,ct,ga,gc,gg,gt,ta,tc,tg,tt,gene,
All,1.293,0.921,0.720,1.108,1.022,0.868,1.166,0.925,0.958,1.285,0.897,0.867,0.729,0.891,1.228,1.123,
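
The comma-separated output can be read with standard tools. The Python sketch below parses the sample shown above into a dictionary; the function name parse_gdinuc is hypothetical, and it assumes the layout seen in this example: a header row starting with "keys", a trailing comma on every row, and a final "gene" column that is empty for the "All" row.

```python
import csv
import io

SAMPLE = """Sequence: NC_000913

keys,aa,ac,ag,at,ca,cc,cg,ct,ga,gc,gg,gt,ta,tc,tg,tt,gene,
All,1.293,0.921,0.720,1.108,1.022,0.868,1.166,0.925,0.958,1.285,0.897,0.867,0.729,0.891,1.228,1.123,
"""


def parse_gdinuc(text):
    """Map each row label (e.g. 'All') to a {dinucleotide: ratio} dict."""
    header = None
    table = {}
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # blank line
        if fields[0] == "keys":
            header = fields[1:]
            continue
        if header is None:
            continue  # preamble such as "Sequence: NC_000913"
        values = {}
        for name, value in zip(header, fields[1:]):
            # Keep the two-letter dinucleotide columns; skip 'gene' and
            # the empty field produced by the trailing comma.
            if len(name) == 2 and value:
                values[name] = float(value)
        table[fields[0]] = values
    return table


usage = parse_gdinuc(SAMPLE)
print(usage["All"]["cg"])
```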

Data files

None.

Notes

None.

References

   Yew et al. (2004) Base usage and dinucleotide frequency of infectious
      bursal disease virus, Virus Genes, 28(1), 41-53.

   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gbui Calculates base usage indices for protein-coding sequences

Author(s)

Hidetoshi Itaya (celery@g-language.org)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None.