Mercurial > repos > ktnyt > gembassy
view GEMBASSY-1.0.3/doc/text/gcodoncompiler.txt @ 0:8300eb051bea draft
Initial upload
author | ktnyt |
---|---|
date | Fri, 26 Jun 2015 05:19:29 -0400 |
parents | |
children |
line wrap: on
line source
gcodoncompiler Function Calculate various kinds of amino acid and codon usage data Description gcodoncompiler calculates various kinds of amino acid and codon usage data. The following values are calculable: A0: Absolute amino acid frequency A1: Relative amino acid frequency C0: Absolute codon frequency C1: Relative codon frequency in a complete sequence C2: Relative codon frequency in each amino acid C3: Relative synonymous codon usage C4: Relative adaptiveness C5: Maximum or minor codon For amino acids unpresent in a gene, C2-C3 does not calculate the values. By using R* in place, such values are hypothesized that alternative synonymous codons are used with equal frequency. G-language SOAP service is provided by the Institute for Advanced Biosciences, Keio University. The original web service is located at the following URL: http://www.g-language.org/wiki/soap WSDL(RPC/Encoded) file is located at: http://soap.g-language.org/g-language.wsdl Documentation on G-language Genome Analysis Environment methods are provided at the Document Center http://ws.g-language.org/gdoc/ Usage Here is a sample session with gcodoncompiler % gcodoncompiler refseqn:NC_000913 Calculate various kinds of amino acid and codon usage data Codon usage output file [nc_000913.gcodoncompiler]: Go to the input files for this example Go to the output files for this example Command line arguments Standard (Mandatory) qualifiers: [-sequence] seqall Nucleotide sequence(s) filename and optional format, or reference (input USA) [-outfile] outfile [*.gcodoncompiler] Codon usage output file Additional (Optional) qualifiers: (none) Advanced (Unprompted) qualifiers: -translate boolean [N] Include to translate using standard codon table -startcodon boolean [N] Include to include start codon -stopcodon boolean [N] Include to include stop codon -delkey string [[^ACDEFGHIKLMNPQRSTVWYacgtU]] Regular expression to delete key (i.e. amino acids and nucleotides) (Any string) -data menu [R0] Kinds of codon usage data. R* hypothesizes amino acids which are not present in the gene (Values: A0 (Absolute amino acid frequency ('AA')); A1 (Relative amino acid frequency ('RAAU')); C0 (Absolute codon frequency ('AF')); C1 (Relative codon frequency in a complete sequence); C2 (Relative codon frequency in each amino acid ('RF')); C3 (Relative synonymous codon usage ('RSCU')); C4 (Relative adaptiveness); i.e., ratio to maximum of minor codon ('W') C5 (Maximum (1) or minor (0) codon); R0 (Absolute codon frequency ('AF')); R1 (Relative codon frequency in a complete sequence); R2 (Relative codon frequency in each amino acid ('RF')); R3 (Relative synonymous codon usage ('RSCU')); R4 (Relative adaptiveness); i.e., ratio to maximum of minor codon ('W') R5 (Maximum (1) or minor (0) codon)) -[no]accid boolean [Y] Include to use sequence accession ID as query Associated qualifiers: "-sequence" associated qualifiers -sbegin1 integer Start of each sequence to be used -send1 integer End of each sequence to be used -sreverse1 boolean Reverse (if DNA) -sask1 boolean Ask for begin/end/reverse -snucleotide1 boolean Sequence is nucleotide -sprotein1 boolean Sequence is protein -slower1 boolean Make lower case -supper1 boolean Make upper case -scircular1 boolean Sequence is circular -sformat1 string Input sequence format -iquery1 string Input query fields or ID list -ioffset1 integer Input start position offset -sdbname1 string Database name -sid1 string Entryname -ufo1 string UFO features -fformat1 string Features format -fopenfile1 string Features file name "-outfile" associated qualifiers -odirectory2 string Output directory General qualifiers: -auto boolean Turn off prompts -stdout boolean Write first file to standard output -filter boolean Read first file from standard input, write first file to standard output -options boolean Prompt for standard and additional values -debug boolean Write debug output to program.dbg -verbose boolean Report some/full command line options -help boolean Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose -warning boolean Report warnings -error boolean Report errors -fatal boolean Report fatal errors -die boolean Report dying program messages -version boolean Report version number and exit Input file format The database definitions for following commands are available at http://soap.g-language.org/kbws/embossrc gcodoncompiler reads one or more nucleotide sequences. Output file format The output from gcodoncompiler is to a plain text file. File: nc_000913.gcodoncompiler Sequence: NC_000913 Agca,Agcc,Agcg,Agct,Ctgc,Ctgt,Dgac,Dgat,Egaa,Egag,Fttc,Fttt,Ggga,Gggc,Gggg,Gggt,Hcac,Hcat,Iata,Iatc,Iatt,Kaaa,Kaag,Lcta,Lctc,Lctg,Lctt,Ltta,Lttg,Matg,Naac,Naat,Pcca,Pccc,Pccg,Pcct,Qcaa,Qcag,Raga,Ragg,Rcga,Rcgc,Rcgg,Rcgt,Sagc,Sagt,Stca,Stcc,Stcg,Stct,Taca,Tacc,Tacg,Tact,Utga,Vgta,Vgtc,Vgtg,Vgtt,Wtgg,Ytac,Ytat,locus_tag 26551,33911,44924,20010,8486,6707,25234,42161,52362,23474,21841,29334,10226,39395,14472,32678,12830,16952,5356,33359,40221,44272,13398,5079,14709,70441,14410,18097,17936,32971,28329,22786,11063,7142,30994,9130,20216,38169,2495,1366,4529,29308,6991,27864,21132,11323,9159,11332,11759,10992,8979,31001,18989,11581,3,14337,20240,34499,24056,20071,16088,21069, Data files None. Notes None. References Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Konayashi, Y., and Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306. Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for large-scale analysis of high-throughput omics data, J. Pest Sci., 31, 7. Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome Analysis Environment with REST and SOAP Web Service Interfaces, Nucleic Acids Res., 38, W700-W705. Warnings None. Diagnostic Error Messages None. Exit status It always exits with a status of 0. Known bugs None. See also gaminoinfo Prints out basic amino acid sequence statistics gaaui Calculates various indece of amino acid usage Author(s) Hidetoshi Itaya (celery@g-language.org) Institute for Advanced Biosciences, Keio University 252-0882 Japan Kazuharu Arakawa (gaou@sfc.keio.ac.jp) Institute for Advanced Biosciences, Keio University 252-0882 Japan History 2012 - Written by Hidetoshi Itaya 2013 - Fixed by Hidetoshi Itaya Target users This program is intended to be used by everyone and everything, from naive users to embedded scripts. Comments None.