Mercurial > repos > ktnyt > gembassy
diff GEMBASSY-1.0.3/doc/text/gcodoncompiler.txt @ 2:8947fca5f715 draft default tip
Uploaded
author | ktnyt |
---|---|
date | Fri, 26 Jun 2015 05:21:44 -0400 |
parents | 84a17b3fad1f |
children |
line wrap: on
line diff
--- a/GEMBASSY-1.0.3/doc/text/gcodoncompiler.txt Fri Jun 26 05:20:29 2015 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,213 +0,0 @@ - gcodoncompiler -Function - - Calculate various kinds of amino acid and codon usage data - -Description - - gcodoncompiler calculates various kinds of amino acid and codon usage data. - The following values are calculable: - A0: Absolute amino acid frequency - A1: Relative amino acid frequency - C0: Absolute codon frequency - C1: Relative codon frequency in a complete sequence - C2: Relative codon frequency in each amino acid - C3: Relative synonymous codon usage - C4: Relative adaptiveness - C5: Maximum or minor codon - - For amino acids unpresent in a gene, C2-C3 does not calculate the values. - By using R* in place, such values are hypothesized that alternative - synonymous codons are used with equal frequency. - - G-language SOAP service is provided by the - Institute for Advanced Biosciences, Keio University. - The original web service is located at the following URL: - - http://www.g-language.org/wiki/soap - - WSDL(RPC/Encoded) file is located at: - - http://soap.g-language.org/g-language.wsdl - - Documentation on G-language Genome Analysis Environment methods are - provided at the Document Center - - http://ws.g-language.org/gdoc/ - -Usage - -Here is a sample session with gcodoncompiler - -% gcodoncompiler refseqn:NC_000913 -Calculate various kinds of amino acid and codon usage data -Codon usage output file [nc_000913.gcodoncompiler]: - - Go to the input files for this example - Go to the output files for this example - -Command line arguments - - Standard (Mandatory) qualifiers: - [-sequence] seqall Nucleotide sequence(s) filename and optional - format, or reference (input USA) - [-outfile] outfile [*.gcodoncompiler] Codon usage output file - - Additional (Optional) qualifiers: (none) - Advanced (Unprompted) qualifiers: - -translate boolean [N] Include to translate using standard - codon table - -startcodon boolean [N] Include to include start codon - -stopcodon boolean [N] Include to include stop codon - -delkey string [[^ACDEFGHIKLMNPQRSTVWYacgtU]] Regular - expression to delete key (i.e. amino acids - and nucleotides) (Any string) - -data menu [R0] Kinds of codon usage data. R* - hypothesizes amino acids which are not - present in the gene (Values: A0 (Absolute - amino acid frequency ('AA')); A1 (Relative - amino acid frequency ('RAAU')); C0 (Absolute - codon frequency ('AF')); C1 (Relative codon - frequency in a complete sequence); C2 - (Relative codon frequency in each amino acid - ('RF')); C3 (Relative synonymous codon - usage ('RSCU')); C4 (Relative adaptiveness); - i.e., ratio to maximum of minor codon ('W') - C5 (Maximum (1) or minor (0) codon); R0 - (Absolute codon frequency ('AF')); R1 - (Relative codon frequency in a complete - sequence); R2 (Relative codon frequency in - each amino acid ('RF')); R3 (Relative - synonymous codon usage ('RSCU')); R4 - (Relative adaptiveness); i.e., ratio to - maximum of minor codon ('W') R5 (Maximum (1) - or minor (0) codon)) - -[no]accid boolean [Y] Include to use sequence accession ID as - query - - Associated qualifiers: - - "-sequence" associated qualifiers - -sbegin1 integer Start of each sequence to be used - -send1 integer End of each sequence to be used - -sreverse1 boolean Reverse (if DNA) - -sask1 boolean Ask for begin/end/reverse - -snucleotide1 boolean Sequence is nucleotide - -sprotein1 boolean Sequence is protein - -slower1 boolean Make lower case - -supper1 boolean Make upper case - -scircular1 boolean Sequence is circular - -sformat1 string Input sequence format - -iquery1 string Input query fields or ID list - -ioffset1 integer Input start position offset - -sdbname1 string Database name - -sid1 string Entryname - -ufo1 string UFO features - -fformat1 string Features format - -fopenfile1 string Features file name - - "-outfile" associated qualifiers - -odirectory2 string Output directory - - General qualifiers: - -auto boolean Turn off prompts - -stdout boolean Write first file to standard output - -filter boolean Read first file from standard input, write - first file to standard output - -options boolean Prompt for standard and additional values - -debug boolean Write debug output to program.dbg - -verbose boolean Report some/full command line options - -help boolean Report command line options and exit. More - information on associated and general - qualifiers can be found with -help -verbose - -warning boolean Report warnings - -error boolean Report errors - -fatal boolean Report fatal errors - -die boolean Report dying program messages - -version boolean Report version number and exit - -Input file format - - The database definitions for following commands are available at - http://soap.g-language.org/kbws/embossrc - - gcodoncompiler reads one or more nucleotide sequences. - -Output file format - - The output from gcodoncompiler is to a plain text file. - - File: nc_000913.gcodoncompiler - -Sequence: NC_000913 -Agca,Agcc,Agcg,Agct,Ctgc,Ctgt,Dgac,Dgat,Egaa,Egag,Fttc,Fttt,Ggga,Gggc,Gggg,Gggt,Hcac,Hcat,Iata,Iatc,Iatt,Kaaa,Kaag,Lcta,Lctc,Lctg,Lctt,Ltta,Lttg,Matg,Naac,Naat,Pcca,Pccc,Pccg,Pcct,Qcaa,Qcag,Raga,Ragg,Rcga,Rcgc,Rcgg,Rcgt,Sagc,Sagt,Stca,Stcc,Stcg,Stct,Taca,Tacc,Tacg,Tact,Utga,Vgta,Vgtc,Vgtg,Vgtt,Wtgg,Ytac,Ytat,locus_tag -26551,33911,44924,20010,8486,6707,25234,42161,52362,23474,21841,29334,10226,39395,14472,32678,12830,16952,5356,33359,40221,44272,13398,5079,14709,70441,14410,18097,17936,32971,28329,22786,11063,7142,30994,9130,20216,38169,2495,1366,4529,29308,6991,27864,21132,11323,9159,11332,11759,10992,8979,31001,18989,11581,3,14337,20240,34499,24056,20071,16088,21069, - - -Data files - - None. - -Notes - - None. - -References - - Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Konayashi, Y., and - Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench - for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306. - - Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for - large-scale analysis of high-throughput omics data, J. Pest Sci., - 31, 7. - - Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome - Analysis Environment with REST and SOAP Web Service Interfaces, - Nucleic Acids Res., 38, W700-W705. - -Warnings - - None. - -Diagnostic Error Messages - - None. - -Exit status - - It always exits with a status of 0. - -Known bugs - - None. - -See also - - gaminoinfo Prints out basic amino acid sequence statistics - gaaui Calculates various indece of amino acid usage - -Author(s) - - Hidetoshi Itaya (celery@g-language.org) - Institute for Advanced Biosciences, Keio University - 252-0882 Japan - - Kazuharu Arakawa (gaou@sfc.keio.ac.jp) - Institute for Advanced Biosciences, Keio University - 252-0882 Japan - -History - - 2012 - Written by Hidetoshi Itaya - 2013 - Fixed by Hidetoshi Itaya - -Target users - - This program is intended to be used by everyone and everything, from - naive users to embedded scripts. - -Comments - - None. -