annotate smart_toolShed/commons/core/seq/BioseqUtils.py @ 0:e0f8dcca02ed

Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
author yufei-luo
date Thu, 17 Jan 2013 10:52:14 -0500
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
1 # Copyright INRA (Institut National de la Recherche Agronomique)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
2 # http://www.inra.fr
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
3 # http://urgi.versailles.inra.fr
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
4 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
5 # This software is governed by the CeCILL license under French law and
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
6 # abiding by the rules of distribution of free software. You can use,
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
7 # modify and/ or redistribute the software under the terms of the CeCILL
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
8 # license as circulated by CEA, CNRS and INRIA at the following URL
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
9 # "http://www.cecill.info".
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
10 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
11 # As a counterpart to the access to the source code and rights to copy,
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
12 # modify and redistribute granted by the license, users are provided only
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
13 # with a limited warranty and the software's author, the holder of the
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
14 # economic rights, and the successive licensors have only limited
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
15 # liability.
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
16 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
17 # In this respect, the user's attention is drawn to the risks associated
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
18 # with loading, using, modifying and/or developing or reproducing the
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
19 # software by the user in light of its specific status of free software,
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
20 # that may mean that it is complicated to manipulate, and that also
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
21 # therefore means that it is reserved for developers and experienced
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
22 # professionals having in-depth computer knowledge. Users are therefore
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
23 # encouraged to load and test the software's suitability as regards their
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
24 # requirements in conditions enabling the security of their systems and/or
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
25 # data to be ensured and, more generally, to use and operate it in the
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
26 # same conditions as regards security.
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
27 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
28 # The fact that you are presently reading this means that you have had
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
29 # knowledge of the CeCILL license and that you accept its terms.
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
30
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
31
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
32 import math
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
33 import re
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
34 from commons.core.seq.Bioseq import Bioseq
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
35
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
36 ## Static methods for sequences manipulation
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
37 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
38 class BioseqUtils(object):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
39
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
40 ## Translate a nucleotide sequence
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
41 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
42 # @param bioSeqInstanceToTranslate a bioseq instance to translate
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
43 # @param phase a integer : 1 (default), 2 or 3
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
44 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
45 def translateSequence(bioSeqInstanceToTranslate, phase=1):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
46 pep = ""
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
47 #length = math.floor((len(self.sequence)-phase-1)/3)*3
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
48 length = int( math.floor( ( len(bioSeqInstanceToTranslate.sequence )-( phase-1 ) )/3 )*3 )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
49 #We need capital letters !
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
50 bioSeqInstanceToTranslate.upCase()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
51 sequence = bioSeqInstanceToTranslate.sequence
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
52 for i in xrange(phase-1,length,3):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
53 if (sequence[i:i+3] == "TTT" or sequence[i:i+3] == "TTC"):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
54 pep = pep + "F"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
55 elif ( sequence[i:i+3] == "TTA" or sequence[i:i+3] == "TTG" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
56 pep = pep + "L"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
57 elif ( sequence[i:i+2] == "CT" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
58 pep = pep + "L"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
59 elif ( sequence[i:i+3] == "ATT" or sequence[i:i+3] == "ATC" or sequence[i:i+3] == "ATA" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
60 pep = pep + "I"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
61 elif ( sequence[i:i+3] == "ATG" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
62 pep = pep + "M"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
63 elif ( sequence[i:i+2] == "GT" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
64 pep = pep + "V"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
65 elif ( sequence[i:i+2] == "TC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
66 pep = pep + "S"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
67 elif ( sequence[i:i+2] == "CC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
68 pep = pep + "P"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
69 elif ( sequence[i:i+2] == "AC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
70 pep = pep + "T"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
71 elif ( sequence[i:i+2] == "GC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
72 pep = pep + "A"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
73 elif ( sequence[i:i+3] == "TAT" or sequence[i:i+3] == "TAC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
74 pep = pep + "Y"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
75 elif ( sequence[i:i+3] == "TAA" or sequence[i:i+3] == "TAG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
76 pep = pep + "*"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
77 elif ( sequence[i:i+3] == "CAT" or sequence[i:i+3] == "CAC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
78 pep = pep + "H"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
79 elif ( sequence[i:i+3] == "CAA" or sequence[i:i+3] == "CAG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
80 pep = pep + "Q"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
81 elif ( sequence[i:i+3] == "AAT" or sequence[i:i+3] == "AAC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
82 pep = pep + "N"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
83 elif ( sequence[i:i+3] == "AAA" or sequence[i:i+3] == "AAG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
84 pep = pep + "K"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
85 elif ( sequence[i:i+3] == "GAT" or sequence[i:i+3] == "GAC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
86 pep = pep + "D"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
87 elif ( sequence[i:i+3] == "GAA" or sequence[i:i+3] == "GAG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
88 pep = pep + "E"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
89 elif ( sequence[i:i+3] == "TGT" or sequence[i:i+3] == "TGC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
90 pep = pep + "C"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
91 elif ( sequence[i:i+3] == "TGA" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
92 pep = pep + "*"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
93 elif ( sequence[i:i+3] == "TGG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
94 pep = pep + "W"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
95 elif ( sequence[i:i+2] == "CG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
96 pep = pep + "R"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
97 elif ( sequence[i:i+3] == "AGT" or sequence[i:i+3] == "AGC" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
98 pep = pep + "S"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
99 elif ( sequence[i:i+3] == "AGA" or sequence[i:i+3] == "AGG" ) :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
100 pep = pep + "R"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
101 elif ( sequence[i:i+2] == "GG" ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
102 pep = pep + "G"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
103 #We don't know the amino acid because we don't have the nucleotide
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
104 #R Purine (A or G)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
105 #Y Pyrimidine (C, T, or U)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
106 #M C or A
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
107 #K T, U, or G
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
108 #W T, U, or A
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
109 #S C or G
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
110 #B C, T, U, or G (not A)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
111 #D A, T, U, or G (not C)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
112 #H A, T, U, or C (not G)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
113 #V A, C, or G (not T, not U)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
114 #N Unknown nucleotide
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
115 elif ( re.search("N|R|Y|M|K|W|S|B|D|H|V", sequence[i:i+3])):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
116 pep = pep + "X"
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
117 bioSeqInstanceToTranslate.sequence = pep
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
118
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
119 translateSequence = staticmethod(translateSequence)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
120
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
121 ## Add the frame info in header
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
122 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
123 # @param bioSeqInstance a bioseq instance to translate
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
124 # @param phase a integer : 1 , 2 or 3
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
125 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
126 def setFrameInfoOnHeader(bioSeqInstance, phase):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
127 if " " in bioSeqInstance.header:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
128 name, desc = bioSeqInstance.header.split(" ", 1)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
129 name = name + "_" + str(phase)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
130 bioSeqInstance.header = name + " " + desc
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
131 else:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
132 bioSeqInstance.header = bioSeqInstance.header + "_" + str(phase)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
133
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
134 setFrameInfoOnHeader = staticmethod(setFrameInfoOnHeader)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
135
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
136 ## Translate a nucleotide sequence for all frames (positives and negatives)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
137 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
138 # @param bioSeqInstanceToTranslate a bioseq instance to translate
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
139 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
140 def translateInAllFrame( bioSeqInstanceToTranslate ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
141 positives = BioseqUtils._translateInPositiveFrames( bioSeqInstanceToTranslate )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
142 negatives = BioseqUtils._translateInNegativeFrames( bioSeqInstanceToTranslate )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
143 listAll6Frames = []
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
144 listAll6Frames.extend(positives)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
145 listAll6Frames.extend(negatives)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
146 return listAll6Frames
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
147
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
148 translateInAllFrame = staticmethod(translateInAllFrame)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
149
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
150 ## Replace the stop codons by X in sequence
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
151 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
152 # @param bioSeqInstance a bioseq instance
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
153 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
154 def replaceStopCodonsByX( bioSeqInstance ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
155 bioSeqInstance.sequence = bioSeqInstance.sequence.replace ("*", "X")
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
156
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
157 replaceStopCodonsByX = staticmethod(replaceStopCodonsByX)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
158
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
159 ## Translate in a list all the frames of all the bioseq of bioseq list
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
160 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
161 # @param bioseqList a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
162 # @return a list of translated bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
163 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
164 def translateBioseqListInAllFrames( bioseqList ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
165 bioseqListInAllFrames = []
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
166 for bioseq in bioseqList :
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
167 bioseqListInAllFrames.extend(BioseqUtils.translateInAllFrame(bioseq))
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
168 return bioseqListInAllFrames
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
169
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
170 translateBioseqListInAllFrames = staticmethod( translateBioseqListInAllFrames )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
171
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
172 ## Replace the stop codons by X for each sequence of a bioseq list
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
173 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
174 # @param lBioseqWithStops a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
175 # @return a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
176 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
177 def replaceStopCodonsByXInBioseqList ( lBioseqWithStops ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
178 bioseqListWithStopsreplaced = []
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
179 for bioseq in lBioseqWithStops:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
180 BioseqUtils.replaceStopCodonsByX(bioseq)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
181 bioseqListWithStopsreplaced.append(bioseq)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
182 return bioseqListWithStopsreplaced
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
183
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
184 replaceStopCodonsByXInBioseqList = staticmethod( replaceStopCodonsByXInBioseqList )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
185
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
186 ## Write a list of bioseq instances in a fasta file (60 characters per line)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
187 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
188 # @param lBioseq a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
189 # @param fileName string
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
190 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
191 def writeBioseqListIntoFastaFile( lBioseq, fileName ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
192 fout = open(fileName, "w")
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
193 for bioseq in lBioseq:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
194 bioseq.write(fout)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
195 fout.close()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
196
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
197 writeBioseqListIntoFastaFile = staticmethod( writeBioseqListIntoFastaFile )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
198
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
199 ## read in a fasta file and create a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
200 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
201 # @param fileName string
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
202 # @return a list of bioseq
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
203 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
204 def extractBioseqListFromFastaFile( fileName ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
205 file = open( fileName )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
206 lBioseq = []
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
207 currentHeader = ""
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
208 while currentHeader != None:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
209 bioseq = Bioseq()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
210 bioseq.read(file)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
211 currentHeader = bioseq.header
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
212 if currentHeader != None:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
213 lBioseq.append(bioseq)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
214 return lBioseq
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
215
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
216 extractBioseqListFromFastaFile = staticmethod( extractBioseqListFromFastaFile )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
217
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
218 ## Give the length of a sequence search by name
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
219 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
220 # @param lBioseq a list of bioseq instances
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
221 # @param seqName string
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
222 # @return an integer
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
223 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
224 def getSeqLengthWithSeqName( lBioseq, seqName ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
225 length = 0
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
226 for bioseq in lBioseq:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
227 if bioseq.header == seqName:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
228 length = bioseq.getLength()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
229 break
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
230 return length
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
231
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
232 getSeqLengthWithSeqName = staticmethod( getSeqLengthWithSeqName )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
233
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
234 def _translateInPositiveFrames( bioSeqInstanceToTranslate ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
235 seq1 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
236 BioseqUtils.setFrameInfoOnHeader(seq1, 1)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
237 BioseqUtils.translateSequence(seq1, 1)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
238 seq2 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
239 BioseqUtils.setFrameInfoOnHeader(seq2, 2)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
240 BioseqUtils.translateSequence(seq2, 2)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
241 seq3 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
242 BioseqUtils.setFrameInfoOnHeader(seq3, 3)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
243 BioseqUtils.translateSequence(seq3, 3)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
244 return [seq1, seq2, seq3]
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
245
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
246 _translateInPositiveFrames = staticmethod( _translateInPositiveFrames )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
247
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
248 def _translateInNegativeFrames(bioSeqInstanceToTranslate):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
249 seq4 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
250 seq4.reverseComplement()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
251 BioseqUtils.setFrameInfoOnHeader(seq4, 4)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
252 BioseqUtils.translateSequence(seq4, 1)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
253 seq5 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
254 seq5.reverseComplement()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
255 BioseqUtils.setFrameInfoOnHeader(seq5, 5)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
256 BioseqUtils.translateSequence(seq5, 2)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
257 seq6 = bioSeqInstanceToTranslate.copyBioseqInstance()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
258 seq6.reverseComplement()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
259 BioseqUtils.setFrameInfoOnHeader(seq6, 6)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
260 BioseqUtils.translateSequence(seq6, 3)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
261 return [seq4, seq5, seq6]
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
262
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
263 _translateInNegativeFrames = staticmethod( _translateInNegativeFrames )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
264
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
265
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
266 ## Return a dictionary which keys are sequence headers and values sequence lengths.
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
267 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
268 def getLengthPerSeqFromFile( inFile ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
269 dHeader2Length = {}
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
270 inFileHandler = open( inFile, "r" )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
271 while True:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
272 iBs = Bioseq()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
273 iBs.read( inFileHandler )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
274 if iBs.sequence == None:
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
275 break
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
276 dHeader2Length[ iBs.header ] = iBs.getLength()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
277 inFileHandler.close()
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
278 return dHeader2Length
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
279
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
280 getLengthPerSeqFromFile = staticmethod( getLengthPerSeqFromFile )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
281
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
282
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
283 ## Return the list of Bioseq instances, these being sorted in decreasing length
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
284 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
285 def getBioseqListSortedByDecreasingLength( lBioseqs ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
286 return sorted( lBioseqs, key=lambda iBs: ( iBs.getLength() ), reverse=True )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
287
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
288 getBioseqListSortedByDecreasingLength = staticmethod( getBioseqListSortedByDecreasingLength )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
289
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
290
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
291 ## Return the list of Bioseq instances, these being sorted in decreasing length (without gaps)
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
292 #
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
293 def getBioseqListSortedByDecreasingLengthWithoutGaps( lBioseqs ):
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
294 return sorted( lBioseqs, key=lambda iBs: ( len(iBs.sequence.replace("-","")) ), reverse=True )
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
295
e0f8dcca02ed Uploaded S-MART tool. A toolbox manages RNA-Seq and ChIP-Seq data.
yufei-luo
parents:
diff changeset
296 getBioseqListSortedByDecreasingLengthWithoutGaps = staticmethod( getBioseqListSortedByDecreasingLengthWithoutGaps )