goligomersearch

 

Function

Searches oligomers in given sequence

Description

goligomersearch searches a sequence for a given oligomer. The oligomer
can be specified using the degenerate nucleotide alphabet or as a regular
expression. The implementation is optimized for fast searching.
The form of the returned value depends on the -return option.
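
As an illustration of searching with the degenerate nucleotide alphabet, here is a minimal Python sketch. It is not the G-language implementation. The function names are hypothetical, and the 1-based positions, single-strand search, and reporting of overlapping matches are assumptions. The 'distribution' return mode is left out because this page does not say which four values it returns.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "s": "[cg]", "w": "[at]",
    "k": "[gt]", "m": "[ac]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": "[acgt]",
}


def degenerate_to_regex(oligomer):
    """Translate an oligomer written with IUPAC codes into a regex string."""
    try:
        return "".join(IUPAC[ch] for ch in oligomer.lower())
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]!r}") from None


def search_oligomer(sequence, oligomer):
    """Return {position: matched oligomer}; positions are 1-based (assumption)."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + degenerate_to_regex(oligomer) + "))")
    return {m.start() + 1: m.group(1) for m in pattern.finditer(sequence.lower())}


def select_return(hits, mode="position"):
    """Shape the result like the 'position', 'oligo' and 'both' return modes."""
    positions = sorted(hits)
    if mode == "position":
        return positions
    if mode == "oligo":
        return [hits[p] for p in positions]
    if mode == "both":
        return dict(hits)
    raise ValueError(f"unsupported return mode in this sketch: {mode!r}")


hits = search_oligomer("ccatgcatgcatgcaa", "atgcatgc")
print(select_return(hits, "position"))  # [3, 7]
print(select_return(search_oligomer("aagctagatc", "gntc"), "both"))  # {7: 'gatc'}
```

The second call shows a degenerate query: 'n' matches any base, so 'gntc' finds 'gatc'.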

The G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

The WSDL (RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods is
provided at the Document Center:

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with goligomersearch:

% goligomersearch refseqn:NC_000913 atgcatgc
Searches oligomers in given sequence
Program compseq output file [nc_000913.goligomersearch]: 


Command line arguments

Standard (Mandatory) qualifiers

  [-sequence] (Parameter 1)
      Type:           seqall
      Description:    Nucleotide sequence(s) filename and optional format,
                      or reference (input USA)
      Allowed values: Readable sequence(s)
      Default:        Required

  [-oligomer] (Parameter 2)
      Type:           string
      Description:    Oligomer to search
      Allowed values: Any string
      Default:        (none listed)

  [-outfile] (Parameter 3)
      Type:           outfile
      Description:    Program compseq output file
      Allowed values: Output file
      Default:        <*>.goligomersearch

Additional (Optional) qualifiers

  (none)

Advanced (Unprompted) qualifiers

  -return
      Type:           selection
      Description:    'position' to return a list of positions where
                      oligomers are found,
                      'oligo' to return a list of oligomers found, ordered
                      by position,
                      'both' to return a hash with positions as keys and
                      oligomers as values,
                      'distribution' to return four values about the
                      distribution of the given oligomer
      Allowed values: Choose from selection list of values
      Default:        position

  -[no]accid
      Type:           boolean
      Description:    Include to use sequence accession ID as query
      Allowed values: Boolean value Yes/No
      Default:        Yes
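
The qualifiers above can be assembled into a command line from a script. The following Python sketch is one way to do that, using only the qualifier names listed on this page. The helper names build_command and run_search are hypothetical, and actually running the command assumes that goligomersearch is installed and can reach the G-language web service.

```python
import subprocess

# Values of -return listed in the qualifier table.
RETURN_MODES = ("position", "oligo", "both", "distribution")


def build_command(sequence, oligomer, outfile, return_mode="position", accid=True):
    """Build the argument list for a goligomersearch call (hypothetical helper)."""
    if return_mode not in RETURN_MODES:
        raise ValueError(f"-return must be one of {RETURN_MODES}, got {return_mode!r}")
    cmd = [
        "goligomersearch",
        "-sequence", sequence,
        "-oligomer", oligomer,
        "-outfile", outfile,
        "-return", return_mode,
    ]
    # -[no]accid is a boolean qualifier: -accid switches it on, -noaccid off.
    cmd.append("-accid" if accid else "-noaccid")
    return cmd


def run_search(cmd):
    """Run the command; needs an installed goligomersearch (not called here)."""
    return subprocess.run(cmd, check=True)


cmd = build_command("refseqn:NC_000913", "atgcatgc", "nc_000913.goligomersearch")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the oligomer is a regular expression containing characters such as brackets.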

Input file format

The database definitions for the following commands are available at
http://soap.g-language.org/kbws/embossrc

goligomersearch reads one or more nucleotide sequences.

Output file format

goligomersearch writes its output to a plain text file.

File: nc_000913.goligomersearch

Sequence: NC_000913 Oligomer: atgcatgc Return: 147018,366819,653138,863326,1288615,1627117,2111200,2246695,2697278,2750962,2826906,2882353,2998362,3022134,3346029,3477018,3629113,3842819,3958304,3982183,4013480,4285578,4474663,4484501,4499080,4604562,4638391
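
A result line of this form can be read back into a script. The Python sketch below assumes the layout shown in the example above (the 'Sequence:', 'Oligomer:' and 'Return:' labels on a single line, with comma-separated positions), which corresponds to the default 'position' return mode. The function name is hypothetical, and the input line is shortened to the first three positions of the example.

```python
import re

# Layout taken from the sample output: labels followed by their values.
LINE_PATTERN = re.compile(
    r"Sequence:\s*(\S+)\s+Oligomer:\s*(\S+)\s+Return:\s*(.*)$"
)


def parse_position_output(line):
    """Split one output line into (sequence name, oligomer, list of positions)."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("line does not look like goligomersearch output")
    name, oligomer, returned = match.groups()
    # Positions are comma-separated integers; an empty field yields no hits.
    positions = [int(p) for p in returned.split(",") if p.strip()]
    return name, oligomer, positions


# First three positions of the example output, shortened for brevity.
example = "Sequence: NC_000913 Oligomer: atgcatgc Return: 147018,366819,653138"
name, oligomer, positions = parse_position_output(example)
print(name, oligomer, len(positions))  # NC_000913 atgcatgc 3
```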

Data files

None.

Notes

None.

References

   Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
      Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
      for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

   Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
      large-scale analysis of high-throughput omics data, J. Pest Sci.,
      31, 7.

   Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
      Analysis Environment with REST and SOAP Web Service Interfaces,
      Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

Program name Description
gkmertable Create an image showing all k-mer abundance within a sequence
gnucleotideperiodicity Checks the periodicity of certain oligonucleotides
goligomercounter Counts the number of given oligomers in a sequence
gsignature Calculate oligonucleotide usage (genomic signature)

Author(s)

Hidetoshi Itaya (celery@g-language.org)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  Institute for Advanced Biosciences, Keio University
  252-0882 Japan

History

2012 - Written by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None.