gphx

 

Function

Identify predicted highly expressed gene

Description

gphx calculates codon usage differences between gene classes for identifying
Predicted Highly eXpressed (PHX) and Putative Alien (PA) genes. A gene is
identified as PHX if BgC/BgH >= 1, where BgC and BgH is a value < 1 by it's
nature. PHX genes are known to generally have favorable codon usage, strong
SD sequences, and probably stronger conservation of promoter sequences.
A gene is idenfitied as PA if BgC and BgH is greater than the median of
BgC for every gene with a length close to the gene.

G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

WSDL(RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods are
provided at the Document Center

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with gphx

% gphx refseqn:NC_000913
Identify predicted highly expressed gene
Codon usage output file [nc_000913.gphx]: 

Go to the input files for this example
Go to the output files for this example

Command line arguments

Qualifier Type Description Allowed values Default
Standard (Mandatory) qualifiers
[-sequence]
(Parameter 1)
seqall Nucleotide sequence(s) filename and optional format, or reference (input USA) Readable sequence(s) Required
[-outfile]
(Parameter 2)
outfile Codon usage output file Output file <*>.gphx
Additional (Optional) qualifiers
(none)
Advanced (Unprompted) qualifiers
-translate boolean Include when translating using standard codon table Boolean value Yes/No No
-delkey string Regular expression to delete key Any string [^ACDEFGHIKLMNPQRSTVWYacgtU]
-[no]accid boolean Include to use sequence accession ID as query Boolean value Yes/No Yes

Input file format

The database definitions for following commands are available at
http://soap.g-language.org/kbws/embossrc

gphx reads one or more nucleotide sequences.

Output file format

The output from gphx is to a plain text file.

File: nc_000913.gphx

Sequence: NC_000913
BgC,BgH,E_g,phx,pa,gene
0.8070,0.8977,0.8990,0,1,thrL
0.1857,0.5958,0.3116,0,0,thrA
0.2323,0.5964,0.3896,0,0,thrB
0.2353,0.6064,0.3881,0,0,thrC
0.4353,0.6020,0.7231,0,1,yaaX
0.2961,0.6790,0.4361,0,0,yaaA
0.2233,0.7009,0.3186,0,0,yaaJ
0.4149,0.3071,1.3511,1,0,talB

[Part of this file has been deleted for brevity]

0.3255,0.7038,0.4625,0,0,yjjX
0.3531,0.5906,0.5979,0,0,ytjC
0.2257,0.5235,0.4311,0,0,rob
0.3584,0.6809,0.5264,0,0,creA
0.3455,0.7950,0.4346,0,0,creB
0.2298,0.7154,0.3212,0,0,creC
0.3299,0.7916,0.4167,0,0,creD
0.3543,0.3786,0.9357,0,0,arcA
0.7295,0.8286,0.8804,0,1,yjjY
0.4028,0.8401,0.4795,0,0,yjtD

Data files

None.

Notes

None.

References

   CMBL- PHX/PA user guide http://www.cmbl.uga.edu/software/PHX-PA-guide.htm

   Henry I., Sharp PM. (2007) Predicting gene expression level from codon
      usage bias Mol Biol Evol, 24(1):10-2.

   Karlin S., and Mrazek J. (2000) Predicted highly expressed genes of diverse
      prokaryotic genomes J.Bacteriol, 182(18):5238-5250.

   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Konayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gcai Calculate codon adaptation index for each gene
gp2 Calculate the P2 index of each gene

Author(s)

Hidetoshi Itaya (celery@g-language.org)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scrips.

Comments

None.