gsignature

 

Function

Calculate oligonucleotide usage (genomic signature)

Description

gsignature calculates short oligonuleotide usage (genomic signture),
defined as the ratio of observed (O) to expected (E) oligonucleotide
frequencies. It is known that the genomic signature stays constant
throughout the genome.

G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

WSDL(RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods are
provided at the Document Center

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with gsignature

% gsignature refseqn:NC_000913
Calculate oligonucleotide usage (genomic signature)
Program compseq output file [nc_000913.gsignature]: 

Go to the input files for this example
Go to the output files for this example

Command line arguments

Qualifier Type Description Allowed values Default
Standard (Mandatory) qualifiers
[-sequence]
(Parameter 1)
seqall Nucleotide sequence(s) filename and optional format, or reference (input USA) Readable sequence(s) Required
[-outfile]
(Parameter 2)
outfile Program compseq output file Output file <*>.gsignature
Additional (Optional) qualifiers
(none)
Advanced (Unprompted) qualifiers
-wordlength integer Word length Any integer value 2
-[no]bothstrand boolean Include to use both strands direct used otherwise Boolean value Yes/No Yes
-[no]oe boolean Include to use O/E value observed values used otherwise Boolean value Yes/No Yes
-[no]accid boolean Include to use sequence accession ID as query Boolean value Yes/No Yes

Input file format

The database definitions for following commands are available at
http://soap.g-language.org/kbws/embossrc

gsignature reads one or more nucleotide sequences.

Output file format

The output from gsignature is to a plain text file.

File: nc_000913.gsignature

Sequence: NC_000913
aa,ac,ag,at,ca,cc,cg,ct,ga,gc,gg,gt,ta,tc,tg,tt,memo
1.206,0.884,0.817,1.103,1.117,0.905,1.159,0.817,0.922,1.283,0.905,0.884,0.755,0.922,1.117,1.206,

Data files

None.

Notes

None.

References

   Campbell A et al. (1999) Genome signature comparisons among prokaryote,
      plasmid, and mitochondrial DNA, Proc Natl Acad Sci U S A. 96(16):9184-9.

   Karlin S. (2001) Detecting anomalous gene clusters and pathogenicity islands
      in diverse bacterial genomes, Trends Microbiol. 9(7):335-43.
   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Konayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gkmertable Create an image showing all k-mer abundance within a sequence
gnucleotideperiodicity Checks the periodicity of certain oligonucleotides
goligomercounter Counts the number of given oligomers in a sequence
goligomersearch Searches oligomers in given sequence

Author(s)

   Hidetoshi Itaya (celery@g-language.org)
   Institute for Advanced Biosciences, Keio University
   252-0882 Japan

  Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scrips.

Comments

None.