comparison GEMBASSY-1.0.3/doc/text/gcai.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
-1:000000000000 0:8300eb051bea
1 gcai
2 Function
3
4 Calculate codon adaptation index for each gene
5
6 Description
7
8 gcai calculates the codon adaptation index (CAI) for each gene. CAI is a
9 measure of the relative adaptiveness of the codon usage of a gene towards the
10 codon usage of highly expressed genes, ranging from 0 (no bias) to 1
11 (maximum bias). CAI can be used as a 'universal' measure of codon usage
12 bias as it is correlated with various gene features such as gene expression
13 level, GC content, and GC skew.
14
15 The CAI for a gene of L codons, where A(i) is the codon at position i and
16 W(A) is the W value (relative adaptiveness) of that codon, is calculated as:
17
18 CAI = exp(sum(log(e,W(A(i)))) / L)
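The calculation can be sketched in Python as the geometric mean of the W values, following the definition of Sharp and Li (1987) listed under References. This is a minimal illustration and not the gcai implementation: the function name `cai`, the `w_values` dictionary, and the toy W values are hypothetical, and the handling of absent codons only mirrors the -wabsent qualifier as it is described in this document.

```python
import math

def cai(codons, w_values, w_absent=-1.0):
    """Geometric mean of the W values of the codons in one gene.

    codons   -- list of codon strings (hypothetical input format)
    w_values -- dict mapping codon -> W value in (0, 1] (hypothetical table)
    w_absent -- W value used for codons missing from w_values; a negative
                value excludes such codons (assumed to mirror -wabsent)
    """
    logs = []
    for codon in codons:
        w = w_values.get(codon, w_absent)
        if w <= 0:
            # Excluded codon; log() is also undefined for w <= 0.
            continue
        logs.append(math.log(w))
    if not logs:
        return float("nan")  # no usable codons
    return math.exp(sum(logs) / len(logs))

# Toy W values, made up for illustration only.
toy_w = {"GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}
print(round(cai(["GCT", "GCC", "AAA", "AAG"], toy_w), 4))
```

Averaging the logarithms before exponentiating keeps the result between 0 and 1 and avoids underflow that a direct product of many small W values could cause.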
19
20 The G-language SOAP service is provided by the
21 Institute for Advanced Biosciences, Keio University.
22 The original web service is located at the following URL:
23
24 http://www.g-language.org/wiki/soap
25
26 The WSDL (RPC/Encoded) file is located at:
27
28 http://soap.g-language.org/g-language.wsdl
29
30 Documentation on G-language Genome Analysis Environment methods is
31 provided at the Document Center:
32
33 http://ws.g-language.org/gdoc/
34
35 Usage
36
37 Here is a sample session with gcai
38
39 % gcai refseqn:NC_000913
40 Calculate codon adaptation index for each gene
41 Codon usage output file [nc_000913.gcai]:
42
45
46 Command line arguments
47
48 Standard (Mandatory) qualifiers:
49 [-sequence] seqall Nucleotide sequence(s) filename and optional
50 format, or reference (input USA)
51 [-outfile] outfile [*.gcai] Codon usage output file
52
53 Additional (Optional) qualifiers: (none)
54 Advanced (Unprompted) qualifiers:
55 -translate boolean [N] Include when translating using standard
56 codon table
57 -wabsent string [-1] W value of codons absent from the
58 reference set; set to a negative value to exclude
59 such codons from the calculation (Any string)
60 -[no]accid boolean [Y] Include to use sequence accession ID as
61 query
62
63 Associated qualifiers:
64
65 "-sequence" associated qualifiers
66 -sbegin1 integer Start of each sequence to be used
67 -send1 integer End of each sequence to be used
68 -sreverse1 boolean Reverse (if DNA)
69 -sask1 boolean Ask for begin/end/reverse
70 -snucleotide1 boolean Sequence is nucleotide
71 -sprotein1 boolean Sequence is protein
72 -slower1 boolean Make lower case
73 -supper1 boolean Make upper case
74 -scircular1 boolean Sequence is circular
75 -sformat1 string Input sequence format
76 -iquery1 string Input query fields or ID list
77 -ioffset1 integer Input start position offset
78 -sdbname1 string Database name
79 -sid1 string Entryname
80 -ufo1 string UFO features
81 -fformat1 string Features format
82 -fopenfile1 string Features file name
83
84 "-outfile" associated qualifiers
85 -odirectory2 string Output directory
86
87 General qualifiers:
88 -auto boolean Turn off prompts
89 -stdout boolean Write first file to standard output
90 -filter boolean Read first file from standard input, write
91 first file to standard output
92 -options boolean Prompt for standard and additional values
93 -debug boolean Write debug output to program.dbg
94 -verbose boolean Report some/full command line options
95 -help boolean Report command line options and exit. More
96 information on associated and general
97 qualifiers can be found with -help -verbose
98 -warning boolean Report warnings
99 -error boolean Report errors
100 -fatal boolean Report fatal errors
101 -die boolean Report dying program messages
102 -version boolean Report version number and exit
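For scripted use, the qualifiers listed above can be assembled into a command line. The following Python sketch only builds the argument list; the helper name `gcai_command` and its keyword arguments are hypothetical, and the `-qualifier=value` form used for -wabsent is assumed to be accepted as in other EMBOSS programs.

```python
def gcai_command(sequence, outfile=None, translate=False, wabsent=None,
                 accid=True, auto=True):
    """Build an argument list for gcai from the qualifiers documented above."""
    cmd = ["gcai", "-sequence", sequence]
    if outfile is not None:
        cmd += ["-outfile", outfile]
    if translate:
        cmd.append("-translate")
    if wabsent is not None:
        # The '=' form keeps a negative value from looking like a qualifier.
        cmd.append("-wabsent=" + str(wabsent))
    if not accid:
        cmd.append("-noaccid")
    if auto:
        cmd.append("-auto")  # turn off prompts
    return cmd

print(" ".join(gcai_command("refseqn:NC_000913", outfile="nc_000913.gcai")))
```

On a machine where gcai is installed, the returned list could be passed to `subprocess.run`; that step is left out here because it depends on the local installation.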
103
104 Input file format
105
106 The database definitions for the following commands are available at
107 http://soap.g-language.org/kbws/embossrc
108
109 gcai reads one or more nucleotide sequences.
110
111 Output file format
112
113 gcai writes its output to a plain text file.
114
115 File: nc_000913.gcai
116
117 Sequence: NC_000913
118 cai,gene
119 0.7256,thrL
120 0.4831,thrA
121 0.4719,thrB
122 0.5178,thrC
123 0.4989,yaaX
124 0.4933,yaaA
125 0.4533,yaaJ
126 0.7074,talB
127
128 [Part of this file has been deleted for brevity]
129
130 0.4681,yjjX
131 0.4797,ytjC
132 0.5350,rob
133 0.4932,creA
134 0.3918,creB
135 0.4170,creC
136 0.4167,creD
137 0.6466,arcA
138 0.4236,yjjY
139 0.3913,yjtD
140
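The output shown above can be read back with a short Python sketch. The layout is inferred from this excerpt only: one "Sequence:" line per input sequence, a "cai,gene" header, then one comma-separated row per gene. The function name `read_gcai` and the returned structure are hypothetical.

```python
import csv
import io

def read_gcai(text):
    """Parse gcai output text into {sequence_id: [(gene, cai), ...]}."""
    results = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # blank line
        first = row[0].strip()
        if first.startswith("Sequence:"):
            current = first.split(":", 1)[1].strip()
            results[current] = []
        elif first == "cai":
            continue  # column header
        elif current is not None and len(row) >= 2:
            results[current].append((row[1].strip(), float(first)))
    return results

sample = """Sequence: NC_000913
cai,gene
0.7256,thrL
0.4831,thrA
0.4719,thrB
"""
parsed = read_gcai(sample)
# Gene with the highest CAI in the sample rows.
top = max(parsed["NC_000913"], key=lambda pair: pair[1])
print(top)
```

Rows without a comma, such as a note that part of the file was omitted, are skipped rather than treated as data.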
141
142 Data files
143
144 None.
145
146 Notes
147
148 None.
149
150 References
151
152 Sharp PM, Li WH. (1987) The codon Adaptation Index--a measure of directional
153 synonymous codon usage bias, and its potential applications.
154 Nucleic Acids Res. 15(3):1281-95.
155
156 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
157 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
158 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
159
160 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
161 large-scale analysis of high-throughput omics data, J. Pest Sci.,
162 31, 7.
163
164 Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
165 Analysis Environment with REST and SOAP Web Service Interfaces,
166 Nucleic Acids Res., 38, W700-W705.
167
168 Warnings
169
170 None.
171
172 Diagnostic Error Messages
173
174 None.
175
176 Exit status
177
178 It always exits with a status of 0.
179
180 Known bugs
181
182 None.
183
184 See also
185
186 gp2 Calculate the P2 index of each gene
187 gphx Identify predicted highly expressed gene
188
189 Author(s)
190
191 Hidetoshi Itaya (celery@g-language.org)
192 Institute for Advanced Biosciences, Keio University
193 252-0882 Japan
194
195 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
196 Institute for Advanced Biosciences, Keio University
197 252-0882 Japan
198
199 History
200
201 2012 - Written by Hidetoshi Itaya
202 2013 - Fixed by Hidetoshi Itaya
203
204 Target users
205
206 This program is intended to be used by everyone and everything, from
207 naive users to embedded scripts.
208
209 Comments
210
211 None.
212