comparison GEMBASSY-1.0.3/doc/text/gsignature.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
1 gsignature
2 Function
3
4 Calculate oligonucleotide usage (genomic signature)
5
6 Description
7
8 gsignature calculates short oligonucleotide usage (genomic signature),
9 defined as the ratio of observed (O) to expected (E) oligonucleotide
10 frequencies. It is known that the genomic signature stays constant
11 throughout the genome.
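
As a rough illustration of the O/E ratio described above, the following Python sketch computes a signature for a short sequence. It is not the gsignature implementation: it assumes that the expected frequency of a word is the product of the frequencies of its individual bases, and that using both strands means counting words on the sequence and on its reverse complement. The names `reverse_complement` and `genomic_signature` are hypothetical.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def genomic_signature(seq, wordlength=2, bothstrand=True):
    """Return {word: O/E} for every word of the given length.

    Assumption: E is the product of the single-base frequencies of the
    word's letters; the formula used by gsignature may differ.
    """
    seq = seq.lower()
    strands = [seq]
    if bothstrand:
        strands.append(reverse_complement(seq))

    base_counts = Counter()
    word_counts = Counter()
    for strand in strands:
        base_counts.update(c for c in strand if c in "acgt")
        for i in range(len(strand) - wordlength + 1):
            word = strand[i:i + wordlength]
            # Skip words containing ambiguous bases such as 'n'.
            if all(c in "acgt" for c in word):
                word_counts[word] += 1

    total_bases = sum(base_counts.values())
    total_words = sum(word_counts.values())

    signature = {}
    for letters in product("acgt", repeat=wordlength):
        word = "".join(letters)
        if total_bases == 0 or total_words == 0:
            signature[word] = 0.0
            continue
        expected = 1.0
        for c in word:
            expected *= base_counts[c] / total_bases
        observed = word_counts[word] / total_words
        signature[word] = observed / expected if expected else 0.0
    return signature
```

Under these assumptions a word and its reverse complement receive the same value when both strands are used, which matches the symmetry visible in the sample output (for example, aa and tt).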
12
13 The G-language SOAP service is provided by the
14 Institute for Advanced Biosciences, Keio University.
15 The original web service is located at the following URL:
16
17 http://www.g-language.org/wiki/soap
18
19 The WSDL (RPC/Encoded) file is located at:
20
21 http://soap.g-language.org/g-language.wsdl
22
23 Documentation on G-language Genome Analysis Environment methods is
24 provided at the Document Center:
25
26 http://ws.g-language.org/gdoc/
27
28 Usage
29
30 Here is a sample session with gsignature
31
32 % gsignature refseqn:NC_000913
33 Calculate oligonucleotide usage (genomic signature)
34 Program compseq output file [nc_000913.gsignature]:
35
38
39 Command line arguments
40
41 Standard (Mandatory) qualifiers:
42 [-sequence] seqall Nucleotide sequence(s) filename and optional
43 format, or reference (input USA)
44 [-outfile] outfile [*.gsignature] Program compseq output file
45
46 Additional (Optional) qualifiers: (none)
47 Advanced (Unprompted) qualifiers:
48 -wordlength integer [2] Word length (Any integer value)
49 -[no]bothstrand boolean [Y] Include to use both strands; direct strand
50 used otherwise
51 -[no]oe boolean [Y] Include to use O/E values; observed values
52 used otherwise
53 -[no]accid boolean [Y] Include to use sequence accession ID as
54 query
55
56 Associated qualifiers:
57
58 "-sequence" associated qualifiers
59 -sbegin1 integer Start of each sequence to be used
60 -send1 integer End of each sequence to be used
61 -sreverse1 boolean Reverse (if DNA)
62 -sask1 boolean Ask for begin/end/reverse
63 -snucleotide1 boolean Sequence is nucleotide
64 -sprotein1 boolean Sequence is protein
65 -slower1 boolean Make lower case
66 -supper1 boolean Make upper case
67 -scircular1 boolean Sequence is circular
68 -sformat1 string Input sequence format
69 -iquery1 string Input query fields or ID list
70 -ioffset1 integer Input start position offset
71 -sdbname1 string Database name
72 -sid1 string Entryname
73 -ufo1 string UFO features
74 -fformat1 string Features format
75 -fopenfile1 string Features file name
76
77 "-outfile" associated qualifiers
78 -odirectory2 string Output directory
79
80 General qualifiers:
81 -auto boolean Turn off prompts
82 -stdout boolean Write first file to standard output
83 -filter boolean Read first file from standard input, write
84 first file to standard output
85 -options boolean Prompt for standard and additional values
86 -debug boolean Write debug output to program.dbg
87 -verbose boolean Report some/full command line options
88 -help boolean Report command line options and exit. More
89 information on associated and general
90 qualifiers can be found with -help -verbose
91 -warning boolean Report warnings
92 -error boolean Report errors
93 -fatal boolean Report fatal errors
94 -die boolean Report dying program messages
95 -version boolean Report version number and exit
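
The qualifiers listed above can be combined into a non-interactive call. The sketch below only assembles the argument list from qualifiers named in this document (-sequence, -outfile, -wordlength, -[no]bothstrand, -[no]oe, -[no]accid, -auto); the helper name `build_gsignature_command` is hypothetical, and running the result assumes that gsignature is installed and on the PATH.

```python
def build_gsignature_command(sequence, outfile, wordlength=2,
                             bothstrand=True, oe=True, accid=True):
    """Build an argument list for a non-interactive gsignature run.

    Boolean qualifiers are written as -name when enabled and -noname
    when disabled, following the -[no]name notation in the help text.
    """
    command = [
        "gsignature",
        "-sequence", sequence,
        "-outfile", outfile,
        "-wordlength", str(wordlength),
    ]
    for name, enabled in (("bothstrand", bothstrand),
                          ("oe", oe),
                          ("accid", accid)):
        command.append(("-" if enabled else "-no") + name)
    # -auto turns off prompts so the call can run from a script.
    command.append("-auto")
    return command


if __name__ == "__main__":
    print(" ".join(build_gsignature_command(
        "refseqn:NC_000913", "nc_000913.gsignature", wordlength=3)))
```

The list can be passed to `subprocess.run`. Because the program is documented as always exiting with status 0, checking that the output file exists and is non-empty is likely to be more informative than checking the return code.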
96
97 Input file format
98
99 The database definitions for the following commands are available at
100 http://soap.g-language.org/kbws/embossrc
101
102 gsignature reads one or more nucleotide sequences.
103
104 Output file format
105
106 The output from gsignature is to a plain text file.
107
108 File: nc_000913.gsignature
109
110 Sequence: NC_000913
111 aa,ac,ag,at,ca,cc,cg,ct,ga,gc,gg,gt,ta,tc,tg,tt,memo
112 1.206,0.884,0.817,1.103,1.117,0.905,1.159,0.817,0.922,1.283,0.905,0.884,0.755,0.922,1.117,1.206,
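
For downstream scripts, the output shown above can be read with a few lines of Python. This is a minimal sketch that assumes each record consists of a "Sequence:" line, a comma-separated header line ending in "memo", and one line of values, and that this three-line block repeats when several sequences are given. That layout is inferred from the single example here, and the function name `parse_gsignature` is hypothetical.

```python
def parse_gsignature(text):
    """Parse gsignature output text into a list of records.

    Each record is a dict with the sequence name, a {word: value}
    mapping of floats, and the (possibly empty) memo field.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    records = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("Sequence:") and i + 2 < len(lines):
            name = lines[i].split(":", 1)[1].strip()
            header = lines[i + 1].split(",")
            fields = lines[i + 2].split(",")
            row = dict(zip(header, fields))
            memo = row.pop("memo", "")
            values = {word: float(value) for word, value in row.items()}
            records.append({"sequence": name, "values": values, "memo": memo})
            i += 3
        else:
            i += 1
    return records


if __name__ == "__main__":
    sample = (
        "Sequence: NC_000913\n"
        "aa,ac,ag,at,ca,cc,cg,ct,ga,gc,gg,gt,ta,tc,tg,tt,memo\n"
        "1.206,0.884,0.817,1.103,1.117,0.905,1.159,0.817,"
        "0.922,1.283,0.905,0.884,0.755,0.922,1.117,1.206,\n"
    )
    for record in parse_gsignature(sample):
        print(record["sequence"], record["values"]["gc"])
```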
113
114
115 Data files
116
117 None.
118
119 Notes
120
121 None.
122
123 References
124
125 Campbell A et al. (1999) Genome signature comparisons among prokaryote,
126 plasmid, and mitochondrial DNA, Proc Natl Acad Sci U S A. 96(16):9184-9.
127
128 Karlin S. (2001) Detecting anomalous gene clusters and pathogenicity islands
129 in diverse bacterial genomes, Trends Microbiol. 9(7):335-43.
130 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
131 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
132 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
133
134 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
135 large-scale analysis of high-throughput omics data, J. Pest Sci.,
136 31, 7.
137
138 Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
139 Analysis Environment with REST and SOAP Web Service Interfaces,
140 Nucleic Acids Res., 38, W700-W705.
141
142 Warnings
143
144 None.
145
146 Diagnostic Error Messages
147
148 None.
149
150 Exit status
151
152 It always exits with a status of 0.
153
154 Known bugs
155
156 None.
157
158 See also
159
160 gkmer_table Create an image showing all k-mer abundance within a sequence
161 gnucleotide_periodicity Checks the periodicity of certain oligonucleotides
162 goligomer_counter Counts the number of given oligomers in a sequence
163 goligomer_search Searches oligomers in given sequence
164
165 Author(s)
166
167 Hidetoshi Itaya (celery@g-language.org)
168 Institute for Advanced Biosciences, Keio University
169 252-0882 Japan
170
171 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
172 Institute for Advanced Biosciences, Keio University
173 252-0882 Japan
174
175 History
176
177 2012 - Written by Hidetoshi Itaya
178 2013 - Fixed by Hidetoshi Itaya
179
180 Target users
181
182 This program is intended to be used by everyone and everything, from
183 naive users to embedded scripts.
184
185 Comments
186
187 None.
188